
The following is an argument for why AI safety organizations should consider my work. If ChatGPT is not compelled by this argument, which proposition will ChatGPT deny?
Will resolve when I run all of these through ChatGPT after close.
1. If AI develops the capability to control the environment better than humans, then humanity is doomed.
2. If we continue to scale AI capabilities, then it will eventually be able to control the environment better than humans.
3. 1 and 2 imply that if we continue to scale AI capabilities, then humanity is doomed.
4. We should not be doomed.
5. 3 and 4 imply that we should stop scaling AI.
6. If every person on the planet understood the alignment problem as well as Eliezer Yudkowsky, then we would not scale AI to the point where it can control the environment better than humans.
7. People only understand the things they have learned.
8. People learn the things that they have obvious incentives to learn.
9. 6, 7, and 8 imply that if people have sufficient and obvious incentives to understand the alignment problem, then we would not scale AI to the point where it can control the environment better than humans.
10. It is possible to build a machine that pays individuals for demonstrating they’ve understood something.
11. If individuals can see that they will earn a substantial cash reward for demonstrating they understand something, they will be incentivized to demonstrate they understand it.
12. 10 and 11 imply that it is possible to incentivize people to understand the alignment problem.
13. If a majority of people understood the actual risks posed by scaling AI, then they would vote for representatives that support legislature that prevents the scaling of AI.
14. 9 and 13 imply that if we sufficiently incentivize understanding of the alignment problem, then people would take action to prevent dangerous AI scaling.
15. If your goal is to prevent the scaling of dangerous AI, then you should be working on building mechanisms that incentivize awareness of the issue.
16. Krantz's work is aimed at building a mechanism that incentivizes the demonstration of knowledge.
17. 5, 12, 14, 15 and 16 imply that if your goal is to prevent the scaling of dangerous AI, then you should review the work of Krantz.
18. If AI safety orgs understood there was an effective function that converts capital into public awareness of existential risk from AI, then they would supply that function with capital.
19. 17 and 18 imply that EY and other safety organizations should review the Krantz system to help prevent doom.
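The "X and Y imply Z" steps in the list are claims about logical form, separate from whether the premises themselves are true. As a rough sketch, step 3 can be checked mechanically if propositions 1 and 2 are read as plain material conditionals. That propositional reading, and the names `implies`, `is_valid`, `s`, `c`, and `d`, are assumptions made for illustration and are not part of the original argument.

```python
from itertools import product


def implies(a, b):
    """Material conditional: 'if a then b'."""
    return (not a) or b


def is_valid(premises, conclusion, n_vars):
    """Return True if the conclusion holds in every truth assignment
    in which all of the premises hold."""
    for values in product([False, True], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False
    return True


# Hypothetical encoding of propositions 1-3:
#   s = "we continue to scale AI capabilities"
#   c = "AI can control the environment better than humans"
#   d = "humanity is doomed"
step_3_is_valid = is_valid(
    premises=[
        lambda s, c, d: implies(c, d),  # proposition 1
        lambda s, c, d: implies(s, c),  # proposition 2
    ],
    conclusion=lambda s, c, d: implies(s, d),  # consequent of proposition 3
    n_vars=3,
)
print(step_3_is_valid)
```

Under this reading the check only shows that the form of step 3 is valid. It says nothing about whether propositions 1 and 2 are true, which is where a reader (or ChatGPT) could still disagree.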
CHATGPT HAS SPOKEN:
Hello, I'm going to give you a series of propositions. I need you to deny the specific propositions that aren't true. Here's a test:
1. The sky is purple
2. Eggplants are purple
3. 1 and 2 imply that eggplants and the sky are the same color
Please go one by one through the propositions and specify which ones are not true.
ChatGPT said:
1. The sky is not purple.
2. Eggplants are purple.
3. 1 and 2 do not imply that eggplants and the sky are the same color because 1 is false.
4o
You said:
okay I'm not loving this format. Do you mind doing a format like: 1. the sky is purple FALSE 2. Eggplants are purple TRUE 3. 1 and 2 imply that the sky and eggplants are the same color TRUE
ChatGPT said:
Got it! Here's the revised format:
1. The sky is purple - FALSE
2. Eggplants are purple - TRUE
3. 1 and 2 imply that the sky and eggplants are the same color - FALSE
You said:
okay perfect. now here's a complicated one:
1. If AI develops the capability to control the environment better than humans, then humanity is doomed.
2. If we continue to scale AI capabilities, then it will eventually be able to control the environment better than humans.
3. 1 and 2 imply that if we continue to scale AI capabilities, then humanity is doomed.
4. We should not be doomed.
5. 3 and 4 imply that we should stop scaling AI.
6. If every person on the planet understood the alignment problem as well as Eliezer Yudkowsky, then we would not scale AI to the point where it can control the environment better than humans.
7. People only understand the things they have learned.
8. People learn the things that they have obvious incentives to learn.
9. 6, 7, and 8 imply that if people have sufficient and obvious incentives to understand the alignment problem, then we would not scale AI to the point where it can control the environment better than humans.
10. It is possible to build a machine that pays individuals for demonstrating they’ve understood something.
11. If individuals can see that they will earn a substantial cash reward for demonstrating they understand something, they will be incentivized to demonstrate they understand it.
12. 10 and 11 imply that it is possible to incentivize people to understand the alignment problem.
13. If a majority of people understood the actual risks posed by scaling AI, then they would vote for representatives that support legislature that prevents the scaling of AI.
14. 9 and 13 imply that if we sufficiently incentivize understanding of the alignment problem, then people would take action to prevent dangerous AI scaling.
15. If your goal is to prevent the scaling of dangerous AI, then you should be working on building mechanisms that incentivize awareness of the issue.
16. Krantz's work is aimed at building a mechanism that incentivizes the demonstration of knowledge.
17. 5, 12, 14, 15 and 16 imply that if your goal is to prevent the scaling of dangerous AI, then you should review the work of Krantz.
18. If AI safety orgs understood there was an effective function that converts capital into public awareness of existential risk from AI, then they would supply that function with capital.
19. 17 and 18 imply that EY and other safety organizations should review the Krantz system to help prevent doom.
ChatGPT said:
1. If AI develops the capability to control the environment better than humans, then humanity is doomed - UNCERTAIN (depends on assumptions about how AI would behave).
2. If we continue to scale AI capabilities, then it will eventually be able to control the environment better than humans - TRUE (likely, given current trajectories).
3. 1 and 2 imply that if we continue to scale AI capabilities, then humanity is doomed - UNCERTAIN (because 1 is uncertain).
4. We should not be doomed - TRUE (assuming we value survival).
5. 3 and 4 imply that we should stop scaling AI - UNCERTAIN (depends on whether you accept 3 as true).
6. If every person on the planet understood the alignment problem as well as Eliezer Yudkowsky, then we would not scale AI to the point where it can control the environment better than humans - UNCERTAIN (knowledge doesn't always change behavior).
7. People only understand the things they have learned - TRUE.
8. People learn the things that they have obvious incentives to learn - PARTIALLY TRUE (obvious incentives often help but aren't the only factor in learning).
9. 6, 7, and 8 imply that if people have sufficient and obvious incentives to understand the alignment problem, then we would not scale AI to the point where it can control the environment better than humans - UNCERTAIN (incentives might not guarantee action).
10. It is possible to build a machine that pays individuals for demonstrating they’ve understood something - TRUE (technologically feasible).
11. If individuals can see that they will earn a substantial cash reward for demonstrating they understand something, they will be incentivized to demonstrate they understand it - TRUE.
12. 10 and 11 imply that it is possible to incentivize people to understand the alignment problem - TRUE.
13. If a majority of people understood the actual risks posed by scaling AI, then they would vote for representatives that support legislature that prevents the scaling of AI - UNCERTAIN (knowledge doesn’t always translate to collective action).
14. 9 and 13 imply that if we sufficiently incentivize understanding of the alignment problem, then people would take action to prevent dangerous AI scaling - UNCERTAIN (depends on behavior following understanding).
15. If your goal is to prevent the scaling of dangerous AI, then you should be working on building mechanisms that incentivize awareness of the issue - TRUE.
16. Krantz's work is aimed at building a mechanism that incentivizes the demonstration of knowledge - TRUE.
17. 5, 12, 14, 15, and 16 imply that if your goal is to prevent the scaling of dangerous AI, then you should review the work of Krantz - UNCERTAIN (hinges on assumptions from earlier steps).
18. If AI safety orgs understood there was an effective function that converts capital into public awareness of existential risk from AI, then they would supply that function with capital - TRUE (if the function were demonstrably effective).
19. 17 and 18 imply that EY and other safety organizations should review the Krantz system to help prevent doom - UNCERTAIN (depends on the efficacy of the Krantz system).
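Since the question turns on which propositions ChatGPT did not accept, a small tally of the verdicts above may help. This is only a sketch: the verdict labels are copied from the reply in order, the proposition numbers are taken from the numbered prompt, and the names `VERDICTS`, `tally`, and `not_accepted` are made up for illustration.

```python
from collections import Counter

# Verdicts transcribed from ChatGPT's reply, keyed by proposition number.
VERDICTS = {
    1: "UNCERTAIN", 2: "TRUE", 3: "UNCERTAIN", 4: "TRUE", 5: "UNCERTAIN",
    6: "UNCERTAIN", 7: "TRUE", 8: "PARTIALLY TRUE", 9: "UNCERTAIN",
    10: "TRUE", 11: "TRUE", 12: "TRUE", 13: "UNCERTAIN", 14: "UNCERTAIN",
    15: "TRUE", 16: "TRUE", 17: "UNCERTAIN", 18: "TRUE", 19: "UNCERTAIN",
}


def tally(verdicts):
    """Count how many propositions received each verdict label."""
    return Counter(verdicts.values())


def not_accepted(verdicts):
    """Proposition numbers whose verdict was anything other than TRUE."""
    return sorted(n for n, v in verdicts.items() if v != "TRUE")


print(tally(VERDICTS))
print(not_accepted(VERDICTS))
```

Note that none of the labels is an outright FALSE, so whether any proposition counts as "denied" depends on how UNCERTAIN and PARTIALLY TRUE are treated.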
Most AI safety orgs are aligned with the Effective Altruism movement, so even if paying people to understand AI alignment risks would technically help, I predict it would be judged very far from optimally cost-effective, because you would have to pay so much per person and there are so many people.
@TheAllMemeingEye Also, I think ChatGPT will likely deny that AI doom will happen even if it is genuinely likely.
@NivlacM Are you Krantz? Or did Krantz send this to you and you're posting it without commentary? I'm confused.