Sven Slootweg ("still kinky and horny anyway")
@joepie91@fedi.slightly.tech  ·  3 hours ago

Reading https://sprocketfox.io/xssfox/2025/11/05/yacy/ and it seems to confirm something about "AI" scrapers that I've feared for a while now:

If you attempt to scrape websites using the YaCy user agent you are often left with disappointment. If you think you can just switch to a Googlebot user agent, you're left with being blocked by WAFs and Cloudflare for not coming from the right IP / AS number or other types of fingerprinting. Places like StackOverflow try very hard to not have their content scraped as it would destroy their business model.

Today we have a new problem: AI scraping. YaCy practically appears no different to an AI scraper when using a Googlebot user agent. The AI scraping shitstorm has effectively stopped another search engine crawler from existing.

I don't know that it's even possible anymore to start a new web search engine.

YaCy - The search engine I thought I would love

Build your own search engine in a day. Fall in love with the concept. Then realise it's all terrible.