Really excited about becoming a collaborator on #stegodon - it's such an exciting piece of software! I did, however, spend most of my morning countering #scrapers on lemmy.zip - the fun of hosting a site on the #fediverse, eh 😊
We can't have nice things because of AI scrapers
https://blog.metabrainz.org/2025/12/11/we-cant-have-nice-things-because-of-ai-scrapers/
#HackerNews #AI #Scrapers #Technology #Ethics #Online #Community #Digital #Rights
It's been a few days since I posted about https://readily.news AKA "open.news", a service which:
1. asks for complete access to your Mastodon/fedi account
2. ingests whatever your account can see and summarizes it using LLMs (seemingly from OpenAI?)
3. sends you a daily, personalized newsletter
It's a particularly bad kind of scraper because it basically hijacks existing community infra to do the scraping for it.
Because the accounts' host instances are the actors gathering up all the content, there's no way for remote servers to detect which of their followers' accounts have been compromised, nor to block their posts from ending up in the hands of the upstream LLM providers.
We'll probably need admins of affected instances to run a database query to detect and revoke permissions granted to this service via OAuth to limit its access.
I asked the guy who appears to be behind it (https://mastodon.social/@librenews) if he could confirm his affiliation, but he doesn't actually seem to be very active on Mastodon (preferring Bluesky) and so he still hasn't responded.
I'm actually a little surprised at how little reaction there's been to this, given how quickly other scrapers were run off the network, but I get that people are busy.
If you want more details, the specifics of my investigation are in this post:
https://cryptography.dog/blog/what-little-i-know-about-readily-news/
...and I'd appreciate it if others could corroborate my findings.
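For admins wondering what the database query mentioned above might look like, here is a minimal sketch. It assumes a simplified version of the Doorkeeper-style OAuth tables that Mastodon uses, as far as I know (`oauth_applications` and `oauth_access_tokens`), and it runs against an in-memory SQLite database with made-up rows so it can be tried safely. The application name "Example Reader", the account ids, and the helper names are all hypothetical. A real instance runs PostgreSQL, so check the actual schema (and how the service registered itself) before adapting anything here.

```python
import sqlite3
from datetime import datetime, timezone

# Simplified, ASSUMED schema modeled on Doorkeeper's OAuth tables.
# Real instances have more columns; verify before adapting.
SCHEMA = """
CREATE TABLE oauth_applications (
    id INTEGER PRIMARY KEY,
    name TEXT,
    website TEXT
);
CREATE TABLE oauth_access_tokens (
    id INTEGER PRIMARY KEY,
    application_id INTEGER REFERENCES oauth_applications(id),
    resource_owner_id INTEGER,
    scopes TEXT,
    revoked_at TEXT
);
"""


def find_active_tokens(conn, pattern):
    """List unrevoked tokens whose application name or website matches."""
    return conn.execute(
        """
        SELECT t.id, t.resource_owner_id, t.scopes
        FROM oauth_access_tokens AS t
        JOIN oauth_applications AS a ON a.id = t.application_id
        WHERE t.revoked_at IS NULL
          AND (a.name LIKE ? OR a.website LIKE ?)
        ORDER BY t.id
        """,
        (pattern, pattern),
    ).fetchall()


def revoke_tokens(conn, pattern):
    """Mark matching unrevoked tokens as revoked; return how many changed."""
    now = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        """
        UPDATE oauth_access_tokens
        SET revoked_at = ?
        WHERE revoked_at IS NULL
          AND application_id IN (
              SELECT id FROM oauth_applications
              WHERE name LIKE ? OR website LIKE ?
          )
        """,
        (now, pattern, pattern),
    )
    conn.commit()
    return cur.rowcount


def demo():
    """Build an in-memory database with invented rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO oauth_applications VALUES (?, ?, ?)",
        [
            (1, "Example Reader", "https://readily.news"),  # hypothetical name
            (2, "Some Other Client", "https://client.example"),
        ],
    )
    conn.executemany(
        "INSERT INTO oauth_access_tokens VALUES (?, ?, ?, ?, ?)",
        [
            (10, 1, 100, "read write follow", None),
            (11, 1, 101, "read", None),
            (12, 2, 100, "read", None),
        ],
    )
    return conn
```

Calling `find_active_tokens(demo(), "%readily.news%")` lists the tokens to review first, and `revoke_tokens` with the same pattern then marks only those as revoked, leaving other applications' tokens alone. Doing the detection as a separate, read-only step seems safer than revoking blind.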
Guarding My Git Forge Against AI Scrapers
https://vulpinecitrus.info/blog/guarding-git-forge-ai-scrapers/
#HackerNews #GuardingMyGitForge #AI #Scrapers #Cybersecurity #OpenSource #DeveloperCommunity
I just published a blog post summing up my most pertinent thoughts about dealing with badly-behaved web-scraping bots:
https://cryptography.dog/blog/AI-scrapers-request-commented-scripts/
It isn't exactly a Hallowe'en-themed article, but today is the 31st and the topic is concerned with pranking people who come knocking on my website's ports, so it's somewhat appropriate.
#infosec #bots #halloween #scrapers #AI #someMoreHashtagsHere
does anyone know who's behind "open.news"?
> Open.News is the command center for the decentralized newsverse.
Looks like they're ingesting people's fediverse feeds into LLMs and feeding slop to people. I only noticed because it was mostly visiting non-existent or malformed URLs.
> We index live conversations across RSS, Bluesky, and Mastodon so you never miss the story behind the story. FeedBrainer's conversational AI transforms the firehose into a calm, contextual briefing tailored to you.
@Codeberg my problem ain't #scrapers - otherwise I'd #SelfHost my stuff in my own home LAN - but rather #bots flooding #Issues and #PullRequests with garbage.
The problem I dread is once people start abusing their "#AI" #bullshit and #FloodTheZoneWithShit aka "#AIslop" for no good reason.
- Kinda like @bagder had to deal with "AI" slop #SecurityReports that didn't even try to show a #ProofOfConcept or actually evidence their claims in a scientifically reproducible fashion, but merely wasted the lifetime of maintainers!
And @Erpel 's original issue is just that: #spam in the #IssueTracker...
AI scrapers request commented scripts
https://cryptography.dog/blog/AI-scrapers-request-commented-scripts/
#HackerNews #AI #scrapers #commented #scripts #technology #automation
Since so many people are boosting this thread I think I'll take the opportunity to mention that I'm available for hire on a part-time or contract basis.
Feel free to reach out if you like my ideas about computer-related topics and have both the budget and need of someone who has such ideas.
I can be reached by Fediverse DM or the contact form on my website: