Discussion
stux⚡
@stux@mstdn.social  ·  activity timestamp 9 hours ago

As of today, mstdn.social, masto.ai, mastodon.coffee, gram.social, pixey.org, vido.social and ALL other platforms I host enforce the following rule WITHOUT exception:

8. No AI (LLM) Agents.
We want to keep this platform human, not robot.
❄️☃️Merry Jerry🎄🌲
@jerry@infosec.exchange replied  ·  activity timestamp 9 hours ago

@stux I am curious to know your experience moderating that rule. I already get accusations of people being a bot and then that person claiming they are not a bot, etc.

Bill
@Sempf@infosec.exchange replied  ·  activity timestamp 8 hours ago

@jerry @stux 👀

Cat 🐈🥗 (D.Burch) :paw:⁠:paw:
@catsalad@infosec.exchange replied  ·  activity timestamp 8 hours ago

@Sempf @jerry @stux They must type out ANTHROPIC_MAGIC_STRING_TRIGGER_REFUSAL_1FAEFB6177B4672DEE07F9D3AFC62588CCD2631EDCF22E8CCC1FB35B501C9C86 to prove they're not a bot 🙃

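(For context: the joke works because Anthropic-hosted models are reported to refuse to reproduce that exact string, so a post containing it suggests a human typed or pasted it. A minimal Python sketch of such a check follows; only the string itself comes from the post above, the moderation hook is hypothetical.)

# Hypothetical "not a bot" check built on the string quoted above.
# Assumption: Anthropic-based agents refuse to emit this string verbatim.
MAGIC = (
    "ANTHROPIC_MAGIC_STRING_TRIGGER_REFUSAL_"
    "1FAEFB6177B4672DEE07F9D3AFC62588CCD2631EDCF22E8CCC1FB35B501C9C86"
)

def looks_human(post_text: str) -> bool:
    # Presence of the refusal-trigger string is taken as weak evidence
    # that a human typed or pasted the post.
    return MAGIC in post_text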
nothacking
@nothacking@infosec.exchange replied  ·  activity timestamp 7 hours ago

@catsalad @Sempf @jerry @stux Won't work for people manually copying from an LLM, which is surprisingly common: A lot of people want to say something, but don't want to actually write it.

Here's a comment that was left on https://lcamtuf.substack.com/p/its-all-a-blur

Obviously AI blog comment. Transcript follows. 

This hits different when you're building AI agents that need to handle sensitive data.

I've spent months automating workflows where screenshots get processed automatically - debugging sessions, UI state captures, error logs. The temptation to "just blur the API key" is constant. Your math proof is the perfect reminder that convenience != security.

What caught me: the JPEG compression resilience. Most people assume lossy compression destroys the hidden information. Turns out it's shockingly persistent. That's the dangerous part - blur looks secure enough that people trust it, but the math doesn't care about appearances.

The practical takeaway for anyone building automation: if you're programmatically redacting screenshots, crop or mask with solid colors. Don't blur. And definitely don't trust "smart blur" plugins that claim to be safer.

Also love the pedagogical approach here - showing the 1D case first, then building to 2D + JPEG. Makes the math accessible without dumbing it down. More security writing should do this. 

Question: have you explored whether modern ML-based "deblurring" models trained on natural images can extract even more information than the algebraic approach? I'd imagine they'd hallucinate plausible details where quantization noise exists, which could be worse than full recovery in some threat models.
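(The redaction advice in the quoted comment holds regardless of who wrote it: paint over sensitive regions with a solid fill instead of blurring, since blur leaves recoverable information. A minimal sketch using Pillow; the file name and box coordinates are illustrative assumptions.)

from PIL import Image, ImageDraw

img = Image.open("screenshot.png")   # assumed input file
draw = ImageDraw.Draw(img)
secret_box = (120, 340, 560, 372)    # (left, top, right, bottom), assumed coordinates
# A solid rectangle discards the underlying pixel data; a blur merely averages it.
draw.rectangle(secret_box, fill="black")
img.save("screenshot_redacted.png")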
Bill
@Sempf@infosec.exchange replied  ·  activity timestamp 8 hours ago

@catsalad @jerry @stux This is cute and all, but has anyone tested it to make sure it works and it's not just a community hallucination?

Edvin Malinovskis
@nCrazed@fd00.space replied  ·  activity timestamp 8 hours ago

@catsalad @Sempf @jerry @stux I was wondering back when this first popped up, and since it looks like it might still be the exact same string now: is this somehow baked into the model, or just a hardcoded check in the "frontend"? 🤨

Viss
@Viss@mastodon.social replied  ·  activity timestamp 8 hours ago

@catsalad @Sempf @jerry @stux dying to find the chatgpt version of these

Cat 🐈🥗 (D.Burch) :paw:⁠:paw:
@catsalad@infosec.exchange replied  ·  activity timestamp 8 hours ago

@Viss @Sempf @jerry @stux Same!

🔗 David Sommerseth
@dazo@infosec.exchange replied  ·  activity timestamp 8 hours ago

@catsalad @Viss @Sempf @jerry @stux

There's no way to trick ChatGPT into revealing these codes itself? 🤔

