#Anubis denied more than 500,000 ai-* requests this month alone and put out more than 2.5 million challenges.
And the best part: it just works.
Browser.Pub and Anubis
Hey @js@podcastindex.social, I enabled Anubis for my site, but now requests from Browser.pub don't work 
Is there something I can write a rule against, perhaps a header or user agent string that is unique to browser.pub?
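For reference, Anubis bot policies can match on the User-Agent header with a user_agent_regex field, so an ALLOW rule along these lines should let Browser.pub through. The "browser.pub" pattern is an assumption about what the service actually sends; check your access logs for the exact string first:

bots:
  # Assumed User-Agent token; verify what Browser.pub really sends before relying on it.
  - name: allow-browser-pub
    user_agent_regex: (?i)browser\.pub
    action: ALLOW

Keep in mind that a User-Agent-only ALLOW is trivially spoofable, so this trades a little protection for making that one client work.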
This week on #OpenSourceSecurity I have a chat with @cadey about #Anubis, the tool that stops web AI scrapers
The scale of web scraping is way worse than I expected, and blocking things is also a lot harder than I expected
This is one of those conversations where I learned how little I know
The effect of putting the #postmarketOS wiki behind #Anubis 😞
I really hope whatever entities are doing this will run out of money soon...
trying #anubis - while it looks like claude was stopped many times, it somehow still managed to request and download tons of forgejo archives. this was with the example config that challenges user agents containing mozilla. i am denying it now, but at the same time i am confused
(this pretty much made forgejo fill up around 1.5T of disk space that day)
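One way to handle this (a sketch, not what the poster actually deployed) is to stop leaning on the Mozilla-in-User-Agent example for the expensive endpoints and instead challenge the Forgejo archive routes directly. The path pattern below is an assumption about Forgejo's /owner/repo/archive/ref.tar.gz URL layout:

bots:
  # Hypothetical rule: challenge anything requesting repository archives,
  # the endpoint that filled roughly 1.5T of disk in the post above.
  - name: challenge-forgejo-archives
    path_regex: /archive/.*\.(zip|tar\.gz)$
    action: CHALLENGE

Denying the crawler's User-Agent outright (as the poster ended up doing) also works, but it turns into whack-a-mole as new bots show up.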
wanderer is now the first tchncs thing with #anubis in front of it (partially) 👀
if (apart from sporadic timeouts - which hopefully now happen much less if at all) something seems broken - such as federation - please dm me. however so far, things luckily seem fine.
it's still somewhat experimental. just as with claude yesterday, today i had to deny amazonbot by hand, which i don't think should be required...
dashboard source: https://github.com/TecharoHQ/anubis/discussions/653#discussioncomment-13469606
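For what it's worth, denying a specific crawler like Amazonbot by hand usually comes down to one small policy entry; a minimal sketch (the rule name and regex are mine, not taken from the dashboard linked above):

bots:
  # Amazonbot advertises itself with the token "Amazonbot" in its User-Agent.
  - name: deny-amazonbot
    user_agent_regex: Amazonbot
    action: DENY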
[Service announcement]
Since this morning, our Git service (git.lacontrevoie.fr) has been hit by a large wave of bot traffic.
For the past two months we have been using #Anubis to protect ourselves from this illegitimate traffic, but these bots now seem to be able to solve its protection mechanism.
We have just reconfigured it to increase the difficulty of the challenges, which may cause a delay when you access our services.
Sorry for the inconvenience!
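For context, the knob being turned here is Anubis's proof-of-work difficulty, exposed in a stock deployment as the DIFFICULTY environment variable (default 4, if memory of the docs serves). A compose-file sketch; the service and upstream names are placeholders:

services:
  anubis:
    image: ghcr.io/techarohq/anubis:latest
    environment:
      # Placeholder upstream; point this at the protected Git service.
      TARGET: "http://git:3000"
      # Default difficulty is 4; each step up makes every challenge exponentially
      # more expensive to solve, for bots and legitimate visitors alike.
      DIFFICULTY: "5"

That extra cost is exactly the access delay the announcement warns about.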
Well, some idiot bot brought my web server to its knees for a few minutes last night by crawling places that are off-limits to robots (it never even fetched robots.txt, the swine). So now I'm reading the #Anubis docs.
You Don't Need Anubis
#HackerNews #You #Need #Anubis #hackernews #techblog #softwaredevelopment #programming #insights
Just updated Anubis to v1.23.0 and it looks like something has changed.
Looking at my timelines from Tusky or Tuba, they would not update, but the web UI of my GoToSocial instance still worked fine. I checked the Anubis log and it looked like Anubis was sending challenges to Tusky, Tuba, and even to other fedi servers (Mastodon etc.).
So anyway, I fixed it 🤞 by creating a new configuration file, /usr/local/share/doc/anubis/data/apps/allow-api-routes.yaml:
- name: allow-api-routes
  action: ALLOW
  expression:
    all:
      - '!(method == "HEAD")'
      - path.startsWith("/api/")
And another one: /usr/local/share/doc/anubis/data/apps/allow-user-routes.yaml:
- name: allow-user-routes
  action: ALLOW
  expression:
    all:
      - '!(method == "HEAD")'
      - path.startsWith("/users/")
And then adding the new configurations to /etc/anubis/<gotosocial_service_name>.botPolicies.yaml:
bots:
  ...
  - import: /usr/local/share/doc/anubis/data/apps/allow-api-routes.yaml
  - import: /usr/local/share/doc/anubis/data/apps/allow-user-routes.yaml
  ...
Restarted the Anubis service, and it works again after that. Dunno if that is even the correct way to do it though. Hopefully I haven't weakened anything.
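For what it's worth, the two files could also be collapsed into a single rule with the same semantics by OR-ing the path prefixes; this is just a restatement of the configuration above, not an official recipe:

- name: allow-gotosocial-federation
  action: ALLOW
  expression:
    all:
      - '!(method == "HEAD")'
      # Client API or ActivityPub actor/inbox routes.
      - 'path.startsWith("/api/") || path.startsWith("/users/")'

As for weakening things: anything that requests paths under /api/ or /users/ now skips the challenge entirely, which is the price of keeping federation and API clients working.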
OK, here's a theory: #ChatGPT's #Atlas browser is not really a browser, but in fact a way for OpenAI to circumvent scrape blockers. It's more of a distributed, human-based scraper than anything else.
Given how widely loathed AI is and how damaging AI scrapers have become, #OpenAI's IP ranges ended up on quite a lot of block lists, and many servers outright terminate any connection from them. Then there are things like #Anubis or #Iocaine that further frustrate #LLM scraping.
But what if you DIDN'T need to bother with any of that? What if you could use civilian IP addresses with "organic" traffic patterns, and have humans solve captchas, provide proof of work for Anubis, or get around Iocaine? All of this for free; you don't even need to pay people for it.
I would be REALLY interested to see what telemetry Atlas sends back. I'm 100% certain it will send back things like the URL and rendered HTML output, and possibly user interaction patterns ("a normal human on this website moves their mouse to the 'I am not a bot' captcha first, then clicks it"). They don't have to respect robots.txt because, well, the traffic comes from organic visitors...
Am I crazy?