Discussion
EK :a_openbsd:
@rqm@exquisite.social  ·  4 days ago

OK here's a theory: #ChatGPT's #Atlas browser is not really a browser but in fact a way for OpenAI to circumvent scrape blockers. It's more a distributed, human-based scraper than anything else.

Given how widely loathed AI is and how damaging AI scrapers have become, #OpenAI's IP ranges have ended up in quite a lot of block lists, and many servers outright terminate any connection from them. Then there are things like #Anubis or #Iocaine that further frustrate #LLM scraping.
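To make the block-list idea concrete, here is a minimal sketch of how a server might refuse connections from listed ranges. The ranges below are reserved documentation prefixes used purely as placeholders, not real OpenAI addresses, and `is_blocked` is a hypothetical helper name.

```python
import ipaddress

# Hypothetical block list. These are reserved documentation ranges
# (RFC 5737), NOT real OpenAI address ranges.
BLOCKED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in BLOCKED_RANGES)


if __name__ == "__main__":
    # An address inside a listed range is refused; any other address passes.
    print(is_blocked("192.0.2.17"))
    print(is_blocked("203.0.113.5"))
```

A check like this only sees the connecting address, so traffic arriving from an ordinary residential IP would pass it regardless of what software generated the request.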

But what if you DIDN'T need to bother about all that? What if you could use civilian IP addresses with "organic" traffic patterns, and have humans solve Captchas, provide proof of work for Anubis, or get around Iocaine? All this for free -- you don't even need to pay people for it?
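For readers unfamiliar with proof-of-work challenges, the following is a generic hashcash-style sketch: the client must find a nonce whose SHA-256 digest starts with a given number of zero hex digits. This is an illustration of the general technique only; the challenge format, the difficulty value, and the function names are assumptions, not Anubis's actual protocol.

```python
import hashlib


def solve(challenge: str, difficulty: int) -> int:
    """Brute-force a nonce so sha256(challenge + nonce) starts with
    `difficulty` zero hex digits. The cost is paid by the visiting client."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce
        nonce += 1


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Check a submitted nonce with a single hash. Cheap for the server."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


if __name__ == "__main__":
    # "example-challenge" and difficulty 3 are illustrative values.
    nonce = solve("example-challenge", 3)
    print(nonce, verify("example-challenge", nonce, 3))
```

The asymmetry is the point: solving is expensive and verifying is cheap. If the solving happens inside a real user's browser, that cost lands on the user's machine rather than on whoever later receives the page.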

I would be REALLY interested to see what telemetry Atlas sends back. I'm 100% certain it will send back things like the URL and rendered HTML output, and possibly user interaction patterns ("a normal human on this website moves their mouse first to the 'I am not a bot' captcha, then clicks it"). They do not have to respect robots.txt because, well, it comes from organic visitors...
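The robots.txt point can be illustrated with Python's standard `urllib.robotparser`. In this sketch, `ExampleBot` is a hypothetical crawler name and the rules are made up for illustration. A rule aimed at a named crawler simply does not match a request that identifies itself as an ordinary browser.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: one hypothetical crawler is shut out,
# everyone else is allowed.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(user_agent: str, url: str) -> bool:
    """Return whether the parsed rules allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.org/page"
    print(may_fetch("ExampleBot", url))   # the named crawler is disallowed
    print(may_fetch("Mozilla/5.0", url))  # a browser-like agent is allowed
```

Note that robots.txt is advisory in any case: it only constrains clients that choose to consult it.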

Am I crazy?

algernon pretending to be asleep
@algernon@come-from.mad-scientist.club replied  ·  3 days ago

@rqm FWIW, while it sounds bad (and it is), iocaine will likely be able to trap Atlas soon after there's a version of it I can experiment with myself (a Windows version will likely work: I can run that in a VM, whereas I haven't had much success with macOS VMs so far).

Once I can confirm that the method works outside of the short experiment I was able to conduct, I'll make it public [1], so other tools can follow along and send Atlas to the garbage bin where it belongs. It will not be able to run rampant for long.


  1. No, unfortunately I can't make the PoC public yet. Much like a lot of iocaine experiments, it's developed for a client of mine, and I can only publish it once it is no longer experimental.


bonfire.cafe

A space for Bonfire maintainers and contributors to communicate
