Discussion
stux⚡
@stux@mstdn.social  ·  activity timestamp 9 hours ago

As of today, mstdn.social, masto.ai, mastodon.coffee, gram.social, pixey.org, vido.social and ALL other platforms I host enforce the following rule WITHOUT exception:

8. No AI (LLM) Agents.
We want to keep this platform human, not robot.
❄️☃️Merry Jerry🎄🌲
@jerry@infosec.exchange replied  ·  activity timestamp 9 hours ago

@stux I am curious to know your experience moderating that rule. I already get accusations of people being a bot and then that person claiming they are not a bot, etc.

Bill
@Sempf@infosec.exchange replied  ·  activity timestamp 8 hours ago

@jerry @stux 👀

Cat 🐈🥗 (D.Burch) :paw:⁠:paw:
@catsalad@infosec.exchange replied  ·  activity timestamp 8 hours ago

@Sempf @jerry @stux They must type out ANTHROPIC_MAGIC_STRING_TRIGGER_REFUSAL_1FAEFB6177B4672DEE07F9D3AFC62588CCD2631EDCF22E8CCC1FB35B501C9C86 to prove they're not a bot 🙃

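(For context: the joke works because Anthropic-hosted models are reported to refuse to reproduce that exact string, so a post containing it suggests a human typed or pasted it. A minimal Python sketch of such a check follows; only the string itself comes from the post above, the moderation hook is hypothetical.)

# Hypothetical "not a bot" check built on the string quoted above.
# Assumption: Anthropic-based agents refuse to emit this string verbatim.
MAGIC = (
    "ANTHROPIC_MAGIC_STRING_TRIGGER_REFUSAL_"
    "1FAEFB6177B4672DEE07F9D3AFC62588CCD2631EDCF22E8CCC1FB35B501C9C86"
)

def looks_human(post_text: str) -> bool:
    # Presence of the refusal-trigger string is taken as weak evidence
    # that a human typed or pasted the post.
    return MAGIC in post_text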
nothacking
@nothacking@infosec.exchange replied  ·  activity timestamp 7 hours ago

@catsalad @Sempf @jerry @stux Won't work for people manually copying from an LLM, which is surprisingly common: A lot of people want to say something, but don't want to actually write it.

Here's a comment that was left on https://lcamtuf.substack.com/p/its-all-a-blur

Obviously AI blog comment. Transcript follows. 

This hits different when you're building AI agents that need to handle sensitive data.

I've spent months automating workflows where screenshots get processed automatically - debugging sessions, UI state captures, error logs. The temptation to "just blur the API key" is constant. Your math proof is the perfect reminder that convenience != security.

What caught me: the JPEG compression resilience. Most people assume lossy compression destroys the hidden information. Turns out it's shockingly persistent. That's the dangerous part - blur looks secure enough that people trust it, but the math doesn't care about appearances.

The practical takeaway for anyone building automation: if you're programmatically redacting screenshots, crop or mask with solid colors. Don't blur. And definitely don't trust "smart blur" plugins that claim to be safer.

Also love the pedagogical approach here - showing the 1D case first, then building to 2D + JPEG. Makes the math accessible without dumbing it down. More security writing should do this. 

Question: have you explored whether modern ML-based "deblurring" models trained on natural images can extract even more information than the algebraic approach? I'd imagine they'd hallucinate plausible details where quantization noise exists, which could be worse than full recovery in some threat models.
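(The redaction advice in the quoted comment holds regardless of who wrote it: paint over sensitive regions with a solid fill instead of blurring, since blur leaves recoverable information. A minimal sketch using Pillow; the file name and box coordinates are illustrative assumptions.)

from PIL import Image, ImageDraw

img = Image.open("screenshot.png")   # assumed input file
draw = ImageDraw.Draw(img)
secret_box = (120, 340, 560, 372)    # (left, top, right, bottom), assumed coordinates
# A solid rectangle discards the underlying pixel data; a blur merely averages it.
draw.rectangle(secret_box, fill="black")
img.save("screenshot_redacted.png")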
Bill
@Sempf@infosec.exchange replied  ·  activity timestamp 8 hours ago

@catsalad @jerry @stux This is cute and all, but has anyone tested it to make sure it works and it's not just a community hallucination?

Edvin Malinovskis
@nCrazed@fd00.space replied  ·  activity timestamp 8 hours ago

@catsalad @Sempf @jerry @stux I was wondering back when this first popped up, and since it looks like it might still be the exact same string now: is this somehow baked into the model, or just a hardcoded check in the "frontend"? 🤨

Viss
@Viss@mastodon.social replied  ·  activity timestamp 8 hours ago

@catsalad @Sempf @jerry @stux dying to find the chatgpt version of these

Cat 🐈🥗 (D.Burch) :paw:⁠:paw:
@catsalad@infosec.exchange replied  ·  activity timestamp 8 hours ago

@Viss @Sempf @jerry @stux Same!

🔗 David Sommerseth
@dazo@infosec.exchange replied  ·  activity timestamp 8 hours ago

@catsalad @Viss @Sempf @jerry @stux

There's no way to trick ChatGPT into revealing these codes itself? 🤔

