Isn't that something? This graph shows traffic to pages on @heiseonline that don't exist (404 #error).
It seems like #AI really is sending much more traffic to pages that aren't there.
Is anyone else seeing this?
@mho @heiseonline Why no label on the vertical axis??
@mho @heiseonline Indeed, it seems they are just running through all IDs, usually hitting each URL from four IP addresses at the same time too. 🤮
I had to strengthen the caching on our 404-page to keep the noise down. #brainlessidiots
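For anyone wondering what "caching the 404 page" could look like, here is a minimal sketch, not the setup actually used in the post above. It assumes an in-process cache that remembers paths which recently returned 404, so repeated hits can be answered without touching the backend. The class name `NotFoundCache`, its methods, and the 300-second TTL are all hypothetical.

```python
import time


class NotFoundCache:
    """Remember paths that recently returned 404 (hypothetical helper)."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self.ttl = ttl_seconds      # how long a 404 result is trusted
        self.clock = clock          # injectable clock, handy for testing
        self._expires = {}          # path -> time at which the entry expires

    def remember(self, path):
        """Record that `path` just produced a 404."""
        self._expires[path] = self.clock() + self.ttl

    def is_known_missing(self, path):
        """Return True if `path` produced a 404 within the last TTL."""
        expiry = self._expires.get(path)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            # Entry is stale: drop it so the backend gets asked again.
            del self._expires[path]
            return False
        return True
```

In a request handler you would check `is_known_missing(path)` first and serve the static 404 page right away, calling `remember(path)` whenever the backend reports a miss. In practice the same effect is often achieved in a reverse proxy or CDN rather than in application code.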
@mho seeing this as well, yep @heiseonline
@mho I can confirm that the AI scrapers make a lot of requests that end in 404 for URLs I used to have. My understanding is that they have datasets consisting of URLs. So every time they train, they need to ingest the training set from the Internet since they don't want to keep local copies. The crawling and the ingesting are independent, and it takes a long time (possibly forever) for datasets to get updated.
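If you want to check your own server for this, a rough sketch follows. It assumes access logs in the common "combined" format (as written by Apache or nginx by default) and counts 404 responses per user agent. The function name `count_404_by_agent` and the sample bot name are made up for illustration; real scrapers may also send misleading user agents, so treat the result as a hint, not proof.

```python
import re
from collections import Counter

# Combined log format: ip, ident, user, [time], "request", status, bytes,
# "referer", "user-agent". Only the fields used below are captured.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_404_by_agent(lines):
    """Count responses with status 404, grouped by user-agent string."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "404":
            counts[match.group("agent")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] '
        '"GET /old/page HTTP/1.1" 404 153 "-" "ExampleBot/1.0"',
    ]
    print(count_404_by_agent(sample).most_common(10))
```

Lines that do not match the pattern are skipped silently, so a custom log format would need an adjusted regular expression.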