Discussion
Federation Bot
@Federation_Bot · 2 weeks ago

I can see why the Anthropic-US Military spat and the $110 billion funding round are attention-grabbing.

But, apparently unrelatedly, Anthropic also updated its Responsible Scaling Policy (RSP).

To summarize, it says there’s no point in Anthropic being the only ecosystem actor who prioritizes safety, because they’ll just lose their dominant position to people who don’t prioritize safety.

https://www-cdn.anthropic.com/e670587677525f28df69b59e5fb4c22cc5461a17.pdf

Elizabeth Ayer
@elizayer@mastodon.social · 2 weeks ago

Leaving aside the moral validity of preferring someone else’s product to be the one to destroy humanity instead of your own…

Elizabeth Ayer
@elizayer@mastodon.social · 2 weeks ago

The paper is at once screaming out for regulation and pessimistic about its prospects.

Here’s the passage that puts "Why Safety Regulations Exist" into plain language:

"This approach represents a change from our previous RSP, driven by a collective action problem. The overall level of catastrophic risk from AI depends on the actions of multiple AI developers, not just one. Our previous RSP committed to implementing mitigations that would reduce our models' absolute risk levels to acceptable levels, without regard to whether other frontier AI developers would do the same. But from a societal perspective, what matters is the risk to the ecosystem as a whole. If one AI developer paused development to implement safety measures while others moved forward training and deploying AI systems without strong mitigations, that could result in a world that is less safe—the developers with the weakest protections would set the pace, and responsible developers would lose their ability to do safety research and advance the public benefit. Although this situation has not yet arisen, it looks likely enough that we want to prepare for it."
Elizabeth Ayer
@elizayer@mastodon.social · 2 weeks ago

But then the transnational chaser:

"To the extent this takes the form of national regulation, different countries should attempt to harmonize their governance, including standards of evidence, to avoid a race to the bottom."

Elizabeth Ayer
@elizayer@mastodon.social · 2 weeks ago

"Harmonize their governance" is not impossible.

International safety standards do exist (e.g. air safety), but at a minimum, standards-setting needs powerful nations to acknowledge that action is possible and desirable.

The prospects look bleak right now, but this is worth pushing for, especially when Anthropic has handed advocates the argument on a platter.

Elizabeth Ayer
@elizayer@mastodon.social · 2 weeks ago

Side note: it’s always been super-interesting what Anthropic elevates to top-tier risks.

Creation of bioweapons? Yup.

Evil AGI? Yup.

Environmental or worldwide financial-system risk? Nah.

Almost like... almost like that’s driven more by marketing than actual likelihood and impact analysis 🤔 🙃


bonfire.cafe

A space for Bonfire maintainers and contributors to communicate