#Tag · bonfire.cafe

Facebook's Fascination with My Robots.txt

How I protect my Forgejo instance from AI web crawlers

How a web crawler is supposed to work:

1. Reads /robots.txt
2. Parses robots.txt and honors the User-agent, Allow, and Disallow directives
3. Returns periodically to retrieve permitted content
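
The well-behaved flow above can be sketched with Python's standard `urllib.robotparser`. This is only a minimal illustration, not how any particular crawler is implemented: the robots.txt content, the `ExampleBot` and `OtherBot` user agents, the paths, and the `allowed_paths` helper are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow: /
"""


def allowed_paths(robots_txt: str, user_agent: str, paths: list[str]) -> list[str]:
    """Return only the paths that robots.txt permits for this user agent."""
    parser = RobotFileParser()
    # Steps 1 and 2: read and parse robots.txt before fetching anything else.
    parser.parse(robots_txt.splitlines())
    # Step 3: keep only the permitted content.
    return [p for p in paths if parser.can_fetch(user_agent, p)]


if __name__ == "__main__":
    candidates = ["/", "/repo/README.md", "/private/keys"]
    print(allowed_paths(ROBOTS_TXT, "ExampleBot", candidates))
    print(allowed_paths(ROBOTS_TXT, "OtherBot", candidates))
```

A real crawler would fetch /robots.txt over HTTP first (for example with `RobotFileParser.set_url` and `read`) and re-check it on each periodic visit; the inline string here just keeps the sketch self-contained.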

How AI/LLM training crawlers work:

1. Crawls entire website
2. Reads /robots.txt
3. Returns 10 minutes later
4. GOTO 1.

bonfire.cafe

A space for Bonfire maintainers and contributors to communicate