When, if ever, will unrestricted GPT-4/Grok 3/DeepSeek-R1-level models result in at least one WMD fatality?
Jul 1
After 2035: 69%
Only after superintelligence is achieved: 49%
2031-2035: 45%
2029-2030: 35%
2027-2028: 20%
2026: 15%
July-Dec 2025: 10%
Feb-June 2025: 5%

Inspired by this Twitter/X thread, but others have expressed similar concerns.

"WMD" is here defined as a nuclear, radiological, chemical, or biological weapon. The attack does not need to be successful; it just needs to result in at least one death (e.g. the aspiring terrorist accidentally gasses themselves).

"Unrestricted" is here defined as any LLM that will help with planning a WMD attack, whether because it was not designed to refuse such a request, was verbally persuaded/jailbroken to do so, was fine-tuned by someone to do so, etc.

I'll try to be generous regarding whether access to the LLM "resulted in" the attack. Unless it's extremely clear that the perpetrator(s) could have accomplished it on their own (e.g. a professional chemist using an LLM to speed up ordering glassware for their chemical weapons project), any use of an LLM will resolve YES.

I've left the ability to add options open if anyone feels these bands are too wide.

I strongly believe a lot of those concerns stem from attempts by big players in the AI industry to limit open-source models.

If you have ever tried building a moderately complicated program with LLM agents, you'll know what a terrible idea building a weapon with them would be. The most likely casualty would be the perpetrator themselves. Moreover, you can definitely do all of those things with conventional web search: all the training data in those models (at least on the relevant topics) comes only from publicly available sources. And such an attack would have a higher chance of success, as the attacker would be forced to do some research in the process.

Those attacks are prevented by limiting access to ingredients, not information. Ignorance for safety is always a lie.

© Manifold Markets, Inc.