Will Anthropic open-source the training code of their SAE interpretability effort?
Ṁ4652028
14%
this year, fully
31%
this year, significantly incomplete
19%
next year
22%
not before 2028
14%
We mean the code used to produce the Scaling Interpretability blog post.
This question is managed and resolved by Manifold.
Related questions
Do Anthropic's training updates make SAE features as interpretable?
50% chance
Will Anthropic, Google, xAI or Meta release a model that thinks before it responds like o1 from OpenAI by EOY 2024?
61% chance
Will Anthropic and OpenAI collaborate substantially on a research paper before 2025?
22% chance
Will Anthropic release a model that thinks before it responds like o1 from OpenAI by EOY 2024?
23% chance
Will OpenAI release weights to a model designed to be easily interpretable (2024)?
10% chance
Will xAI join the voluntary commitment by OpenAI/Anthropic to AISI to share major new models w/AISI prior to release?
65% chance
Will Anthropic's April SAE Training Updates stack with Gated SAEs?
64% chance
Will Meta join the voluntary commitment by OpenAI/Anthropic to AISI to share major new models w/AISI prior to release?
38% chance
Will Anthropic have AI-related IP stolen before 2026?
46% chance
Will OpenAI go back on its voluntary commitment to AISI to share major new models w/AISI prior to release?
35% chance