Here's the thing about these crawlers, and I'm gonna make it bold, because it is important: you cannot make them go away.
They have more money than you do. They have more bandwidth than you do. They have more resources than you do. They have more everything than you do[1]. They do not fucking care.
Anubis won't make them go away. iocaine won't make them go away. go-away won't make them go away. Nepenthes won't make them go away. None of those will. They may block access to the real contents, they may reduce the impact the crawlers have, but none of them make them go away.
Serving them an empty 401 won't make them go away, either. That reduces your outgoing traffic significantly, but the bots will keep coming back, because they - I repeat - do not fucking care.
If you want to make them go away, you block them on the firewall. That's not always possible, but in the case of AWS Singapore, when you don't expect legit traffic from there, blocking the entire ASN is an option.
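To make "blocking the entire ASN" a bit more concrete, here's a rough sketch of the bookkeeping side: take the prefixes an ASN announces, merge them, and turn them into something a firewall can eat. Everything specific in it is an assumption: the prefixes are documentation ranges rather than real AWS data, and the nftables table and set names (`filter`, `blocked_asn`) are made up and would have to exist in your ruleset already. Where you get the real prefix list from is up to you.

```python
import ipaddress

# Hypothetical prefixes standing in for what one ASN announces.
# These are RFC 5737 documentation ranges, NOT real AWS data.
ASN_PREFIXES = [
    "192.0.2.0/25",
    "192.0.2.128/25",
    "198.51.100.0/24",
    "203.0.113.0/24",
]


def collapse(prefixes):
    """Merge adjacent or overlapping prefixes into the smallest equivalent list."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return list(ipaddress.collapse_addresses(networks))


def is_blocked(address, networks):
    """Return True if the address falls inside any of the blocked networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in networks)


def nft_command(networks, table="filter", set_name="blocked_asn"):
    """Build one nft command that adds every network to an existing set.

    The table and set names are assumptions; the set must already exist
    and be referenced by a drop rule for this to have any effect.
    """
    elements = ", ".join(str(n) for n in networks)
    return f"nft add element inet {table} {set_name} {{ {elements} }}"


BLOCKED = collapse(ASN_PREFIXES)

if __name__ == "__main__":
    print(nft_command(BLOCKED))
    print(is_blocked("192.0.2.200", BLOCKED))
```

Collapsing first keeps the set small, which matters once you're feeding it thousands of prefixes from several ASNs instead of four.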
If you can't firewall them off, the best you can do is mitigate. You cannot make them go away. You can make it more expensive for them, and help the bubble burst faster by serving them garbage. But that doesn't come for free, and if you're having traffic problems, this ain't an option either.
[1] ...until the bubble bursts
@algernon And if it's THAT crawler I'm thinking of, even blocking IPs is exceptionally difficult, because it cycles through an absolutely massive number of IPv4 addresses, using each address only once or twice in a month.
You have to drop all traffic from at least three different ASNs.