Reading https://sprocketfox.io/xssfox/2025/11/05/yacy/ and it seems to confirm something about "AI" scrapers that I've feared for a while now:
If you attempt to scrape websites using the YaCy user agent you are often left with disappointment. If you think you can just switch to a Googlebot user agent, you're left with being blocked by WAFs and Cloudflare for not coming from the right IP / AS number or other types of fingerprinting. Places like Stack Overflow try very hard to not have their content scraped as it would destroy their business model.
Today we have a new problem: AI scraping. YaCy practically appears no different to an AI scraper when using a Googlebot user agent. The AI scraping shitstorm has effectively stopped another search engine crawler from existing.
I don't know that it's even possible anymore to start a new web search engine.