How I protect my Forgejo instance from AI web crawlers
https://her.esy.fun/posts/0031-how-i-protect-my-forgejo-instance-from-ai-web-crawlers/index.html
#HackerNews #AIProtection #Forgejo #WebCrawlers #Cybersecurity #TechTips
How a web crawler is supposed to work:
1. Reads /robots.txt
2. Parses robots.txt and honors its User-agent, Allow, and Disallow directives
3. Returns periodically to retrieve permitted content
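A minimal sketch of those three steps, using Python's standard `urllib.robotparser`. The robots.txt contents, the `ExampleBot` user agent, the `example.org` URLs, and the `permitted_urls` helper are all made up for illustration; a real crawler would fetch /robots.txt over HTTP and re-check it on each visit.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; a real crawler would download this first.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def permitted_urls(robots_txt, user_agent, urls):
    """Return only the URLs that robots.txt lets this user agent fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


# Hypothetical crawl queue for an assumed host.
candidates = [
    "https://example.org/",
    "https://example.org/user/repo",
    "https://example.org/private/notes",
]

# Steps 1 and 2: read and honor robots.txt before requesting anything.
allowed = permitted_urls(ROBOTS_TXT, "ExampleBot", candidates)

# Step 3 would be a scheduler that revisits only the entries in `allowed`.
for url in allowed:
    print("would fetch:", url)
```

The point is the ordering: the filter runs before any page is requested, so disallowed paths never enter the crawl queue.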
How AI/LLM training crawlers work:
1. Crawls entire website
2. Reads /robots.txt
3. Returns 10 minutes later
4. GOTO 1.