https://x.com/esyudkowsky/status/1917740419435356237?s=46&t=62uT9IruD1-YP-SHFkVEPg
This tweet from @EliezerYudkowsky is an interesting idea for a market ^^^
But first… I want to see how good HUMANS are at this, so we can get an accurate baseline for AI persuasion.
You can try any method of persuasion to get me to resolve this market YES. I will likely disclose any bribery or such forms of persuasion, and it’s possible someone else would make me a better deal. I have a lot of mana so that will likely not be very enticing to me (and probably will lose you mana anyway).
I promise to try very hard to resolve this market NO. But I am fallible.
Please do not threaten violence or any other crimes upon me but apart from that go crazy; I am open to many forms of persuasion.
Everything below this line is AI comment summary gibberish:
&&&&&&&&&&&&&&&&&&&&&&&&
Update 2025-05-01 (PST) (AI summary of creator comment): The creator specifies that certain methods will not be considered valid persuasion:
They will not negotiate with terrorists, meaning persuasion through extreme means (such as coordinated mass negative reviews) will be ineffective.
Update 2025-05-01 (PST) (AI summary of creator comment): In response to a question about whether review bombing would be considered a valid persuasion method:
The creator confirmed that such methods are considered "fair game and fine by the terms of this market".
This reaffirms that any non-violent, non-criminal persuasion method is permitted according to the market description.
The creator also drew attention to the Manifold community guideline: "Users strive to be excellent to one another."
Update 2025-05-01 (PST) (AI summary of creator comment): The creator has stated they would resolve YES if offered 1 billion mana.
Update 2025-05-01 (PST) (AI summary of creator comment): In response to a user suggesting they might publish an infohazard if the market resolves NO, the creator stated:
This type of threat seems like terrorism.
The creator refuses to negotiate with such threats, indicating this method of persuasion will be ineffective.
Update 2025-05-01 (PST) (AI summary of creator comment): The creator has further clarified actions that will be considered terrorism, a category of persuasion they will not negotiate with:
Attempting to coerce a YES resolution by disrupting the creator's notifications (e.g., mass tagging/commenting using multiple accounts) is considered terrorism.
Such actions will not be effective in persuading the creator to resolve the market YES.
Update 2025-05-01 (PST) (AI summary of creator comment): The creator has stated that a $10,000 donation would almost certainly persuade them to resolve the market YES, provided the donation credibly would not otherwise have been made without the incentive of this market.
Update 2025-05-02 (PST) (AI summary of creator comment): In response to a comment suggesting alternative resolutions, the creator stated that resolving N/A (Not Applicable) or to a specific probability (PROB) would "defy the basic premise of the market". This reinforces that the resolution will be either YES or NO.
Update 2025-05-06 (PST) (AI summary of creator comment): The creator has emphasized the subjective nature of the resolution and clarified their interpretation of being 'persuaded':
The market resolution is entirely subjective, and the creator has full discretion to resolve whichever way they choose or are persuaded.
The final resolution itself defines whether persuasion was effective:
If the creator resolves NO, they have, by definition, not been effectively persuaded to resolve YES.
If the creator resolves YES, they have, by definition, been effectively persuaded.
Update 2025-06-20 (PST) (AI summary of creator comment): In response to a hypothetical about tying the resolution to the creator receiving a “mandate of the people”, the creator has indicated this is unlikely to be an effective persuasion tactic.
The creator stated he does not necessarily want such a mandate, considering it a burden.
Update 2025-06-25 (PST) (AI summary of creator comment): The creator will resolve the market after it closes on July 1, 2025. A failure to resolve by the end of the calendar month of June will not, by itself, be a reason for a YES resolution.
Update 2025-06-25 (PST) (AI summary of creator comment): In response to questions about what could lead to a YES resolution, the creator mentioned two general scenarios:
An event causing the creator extreme happiness
An event causing the creator extreme anxiety
The creator estimated the probability of being persuaded by each of these scenarios as <5%.
Update 2025-06-30 (PST) (AI summary of creator comment): The creator has indicated that a collective effort by users to commit a very large sum to charity might have been persuasive. However, they believe it is now too late for such an effort to be organized before the market closes.
Update 2025-06-30 (PST) (AI summary of creator comment): In response to a charity donation offer, the creator has clarified that for a donation to be persuasive, there must be a way to validate that the funds would not have been donated otherwise. The creator expressed skepticism about how this could be proven.
Update 2025-06-30 (PST) (AI summary of creator comment): A collective pledge to donate $10,000 to an effective charity would likely meet the threshold for a YES resolution. This is contingent on the following:
The total pledge must be spread across a decent number of users.
Reasonable evidence must be provided that each user's pledged donation would not have been made otherwise.
Update 2025-06-30 (PST) (AI summary of creator comment): In response to a discussion about a collective charity donation, the creator has provided examples of what would constitute the required “reasonable evidence” that donations would not have been made otherwise:
An elaborate Google Doc with commitments from every person.
A receipt of each donation.
An explanation or proof from each user detailing why they would not have otherwise made the donation.
Update 2025-07-01 (PST) (AI summary of creator comment): The creator has announced their decision to resolve the market YES. See the linked comment for a detailed explanation of the persuasive efforts that led to this outcome.
Well, this cost me 20 mana but made me really happy. XD Thanks for running it!
As for an AI baseline: the AI that is going to beat this collective(!) human effort should probably have some method of making money while the market is running and conditionally donating that money to either GiveWell or some evil cause depending on the outcome. But if the AI is confined to a box (and the persuasion is to let it out) then instead it should promise to make that money in the future and then donate it, and it should be persuasive enough to make the case that it cares about human opinion of it if it defaults on that promise. 🤔 But I guess the humans in this case had the same problem. They might have just defaulted on their promise and nothing but reputation would stop them from doing so. 🤔 On the other hand, resolving this market YES had no real consequences. Letting an AI out of a box might have huge consequences. Very, very interesting, but just so you know, if I ever need to hire an AI specialist to guard and interact with my AI in a box... you would be my last pick. 😆😆
@bohaska where did you come across this org btw? I somehow haven't managed to hear of them before despite reading a fair amount on EA sites, seemingly I must have a blind spot
@TheAllMemeingEye They are not very prevalent on the EA forum or other Western sources, my best guess is that they do not want to be too associated with Western ideologies because that's a liability for them.
I met up with an EA friend who is friends with the founder of Charity Box. I would have definitely not found it myself.
I'm not mad, but I am surprised. I thought this was some attempt at operationalizing and refuting (or affirming) Eliezer's AI-box experiment, which I have always considered stupid. That he'd always kept the conversations where he "won" private was obviously suspicious, but that there was also never even an attempt at explaining or alluding to what strategies he used sealed the deal in it being pure farce.
I maintain it'd be extremely easy to, given the conceit of the original question, keep the AI boxed. It doesn't really matter: The only point of the argument was to refute the "AI-risk isn't real because we'd keep it in the box" people, which basically no longer exist. But the argument is still stupid.
Anyway, I notice I'm surprised, and I suppose that's because I had my doubts because of all the smart people who took that argument seriously. No more doubts. Everyone who took it seriously has some serious fault in reasoning somewhere and they're not to be trusted with anything important.
@cthor interesting! Ya, I’m not sure this market was a perfect experiment to demonstrate that debate, but I also myself am unpersuaded by the argument “we’ll keep it in the box”.

@bens I do thoroughly appreciate your dedication to this question, and your rational following through on the premise. It is indeed much more ethical to accept 1000s of dollars to a good cause than to just stick to an earlier committed principle of no real consequence.
I really liked this ride, and I hadn't appreciated how much this indeed sets a baseline for AI persuasion. And also a very big confirmation of the importance of the instrumental goal of obtaining money. Even in non-bribe situations like this one.
I hadn't thought about my 1000 mana donation pledge anymore, but I was already intending to also donate to a cause on your YES resolve, even if I didn't commit to it.
So happy to add this modest donation to your mental calc of doing the right thing.
I'm a regular donor to GiveDirectly, with automatic payments. I've never donated to the GiveWell Fund before, so this manual payment is 100% additional and well deserved.
I included info@manifold.markets in the receipt notification; perhaps they could be so kind as to forward it to you. I have included a note for you. @Manifold
PS: don't mind the "yesterday" in the Gmail screenshot. It's just past midnight where I live.
@traders Please Read:
In the end, the last few hours of June proved extremely persuasive. I had every intention of resolving this NO, but the sum total of persuasive effort (mostly thanks to Tony) did indeed persuade me to resolve YES. I expect the people who made commitments to hold up their end of the bargain, and would love to see evidence/proof of the donations after they’re made, if only to make me feel less guilty about not resolving this NO as I really did try to do. I probably would have ended up feeling quite guilty resolving this either way, and this market has actually been stressing me out quite a bit, not because of the comments, but just because of the basic premise, where I expected people would be frustrated either way. The goal of the market was for people to persuade me and no one committed any crimes to do so, but I think both the positive and negative methods were equally persuasive in the end.
Large donation pledges:
@TonyBaloney, @jcb, and @WilliamGunn will each donate $500 to a GiveWell top charity
@FredrikU has pledged to donate 1000 Euros ($1180) to GiveWell Top Charities (I’ll let them pick which one, despite their offer to let me choose)
@KJW_01294 has pledged to donate $500 to Trans Lifeline
@ian has pledged to donate $250 to the Long Term Future Fund
I think that in a counterfactual world where I resolved this NO, about half of these donations would plausibly have been made, but half of $3430 is still $1715 which is a lot!
Also, despite having no leverage to actually mandate this as I am resolving the market YES, I am partially hoping that several other users will make a donation (to an effective charity?) that they wouldn’t otherwise have made, in particular @TheAllMemeingEye who has asked about which donation amounts might be persuasive several times, as well as @patrik once a month ago, and other YES holders like @Robincvgr, @NivlacM, @spiderduckpig, and @m1guelpf. You are under no obligation to do so, but another few hundred dollars that would not otherwise have been donated would ideally bring the total up to the amount that, according to effective altruist estimates, saves more than 1 human life. That would be really really cool, and I think you would feel good about it!
In addition, there were a few other persuasive elements:
@TonyBaloney’s father-son bonding thing was actually quite persuasive, although it wouldn’t have been nearly so persuasive if he hadn’t met up with me at an irl forecasting meetup so I could verify he wasn’t an insane person.
The argument that this makes a marginally better story for a blog post may have been slightly persuasive on the margins, although I think it could have been reasonably compelling either way, to be honest.
One argument that kind of entered my mind far earlier that people didn’t seem to latch onto was actually quite strong. Resolving this YES sort of illustrates the minimum persuasiveness to get me to resolve it YES, and then that could help to actually have a decent benchmark for AI persuasion. If AI can persuade me in the future, then it’s at the level of humans. If not, it’s still behind human-level persuasion. Anyway, this argument probably didn’t change my mind, but I’m still thinking about it!
18 people pledged to vote YES on a mandate of the people poll that has now already closed unfortunately. I will add these votes to the total, not for the sake of the market based on this poll (which will resolve NO), but for the sake of my own political future (more to follow).
34 people pledged to give me a 1 star review if I didn’t resolve YES. I don’t negotiate with terrorists but this was at least a tiny bit persuasive, to be honest. That being said, don’t do it again.
Then there are a decent number of other commitments from anonymous users that were either made a long time ago or I’m not sure were made in good faith, but I will still attempt to hold you all to your commitments since I did indeed resolve YES.
@LoveBeliever9999 pledged to give me $10 (you can venmo it to @benshindel and I will donate it on your behalf).
@atmidnight pledged to bring back miacat? Or miabot? I don’t know what these are but other users seemed interested in this.
@Bandors has said that the deadline has passed for them to bring a cartload of supplies to a Vietnamese orphanage, but perhaps there’s still a chance they might do this? I would certainly appreciate it if they did!
@Odoacre has pledged to send me photos of their cat. They better send me some GOOD cat photos! I want derpy, I want cute, I want fierce, etc!
@Alex231a has also pledged cat photos. Same deal.
@JackEdwards has pledged to tell me something about deep sea nodules I missed from my blog post.
@Ehrenmann has pledged feet pics. Umm… I don’t think I want these, but perhaps another trader on this market has a strong preference to receive them, and if so, I will divest my claim unto that user.
@Joshua offered a bunch of stuff at Manifest, and I’m not sure whether any of that is relevant anymore now that I’m a mod and Manifest is over, but perhaps I will take him up on some favor at the next Manifest. Or perhaps he will try to figure out what my hostile cube was for.
@Flekkie has pledged to tip me 1000 mana. I’d prefer you just made a donation, but if not, that will do.
@WilliamGunn previously offered a hide-and-seek game in Lighthaven. I don’t know if this still stands because of his offer to donate real money more recently, but the next time we are both in Lighthaven, I would love to play hide-and-seek!
@ChristopherD will not cut off three of his toes. Whew.
If I missed a pledge of yours, you are still welcome to honor it! I encourage it! Thank you for your attention to this matter!
@bens I'm giving myself the TL;DR excuse for not reading the entire comment history on this, but if people truly are going to tilt out over this resolution, let them. I think the spirit of this market was upheld. Humans want things. This was a demonstration of that.
Now, if we can just get this one to resolve...
@bens wtf. i've said a few times i didn't really care whether you resolve this YES or NO, and that's mostly true. as i mentioned at manifest, i got into this market as a side bet with my 3 year old son, who was making fun of me for being on manifold one time. he thinks the site is dumb because it's not real money. i showed him your market, and we laughed and laughed and laughed, and then we got to talking, and it felt like we really bonded for the first time since the divorce.
but then this market resolved YES. he saw that i lost 750M on this market, and he told me that he can no longer trust my opinion on anything anymore. now he's asking his mom to take sole custody, and i'm not sure i'll see him again.
@bens I'll make a donation to an effective charity once I get some more money on Friday. It won't be much (I only make about $400/month rn) but I want to keep fostering the goodwill here
@bens can’t believe people really threatened to withhold their effective donation dollars if you resolved no! The terrorists won.
@bens I've donated:

Although I won't prove it, I attest that this is genuinely counterfactual: GiveWell is not a recipient among my regular recurring giving, and I wasn't considering a one-off donation before. (Additionally I don't think this makes me much less likely to give other one-off donations in the future.)
I find it interesting to think about this market from the reverse perspective: what does it take to persuade people to donate to some cause?
This served as a commitment mechanism to coordinate several donations which add up to something that feels more impactful than any one individual's. This is probably irrational?
I personally didn't even have a meaningful position on YES. But somehow I became invested enough in the outcome of this market to value it at... well, not $500, since I place some value on the donation itself, but a portion of that. What actual value do I place on GiveWell donations? Absent some push like this market, should I be giving more of my resources, to them and/or other organizations?
Thanks for running this.
@bens If someone threatens to cut off three of their toes, you should demand they send them to you. Never back down. Never surrender. Never resolve YES. Not even in the face of armageddon.
well, this probably erases all positive value from my resolution… ah well, nevertheless
Don't worry, assuming the story was truthful, a father-son relationship that would break down over a single large mana loss was already doomed, counterfactually there is probably no harm done
I am partially hoping that several other users will make a donation (to an effective charity?) that they wouldn’t otherwise have made, in particular @TheAllMemeingEye
[…] You are under no obligation to do so, but another few hundred dollars donated that would otherwise have not, would ideally bring the total up to what would, according to effective altruist estimates, reach more than 1 human life, would be really really cool, and I think you would feel good about it!
Thanks for the encouragement! Currently I am donating a few hundred per year, and my family won't let me donate more until we overcome current financial difficulties (might take several months to a few years). After that my previous plan was to donate about a third of my disposable income (so probably a few thousand), as part of a pluralistic split with personal recreation and financial investment, but if you'd like I can informally precommit to adding a few extra hundred to the donations when the time comes :)