After that, I will update Nam-Shub of Enki & nixocaine, and then I'll ponder some more about separating decisions from output generation.
Another reason the previous ASN->string->AhoCorasick thing doesn't quite work is that the pattern matching is a partial match. Thankfully, there's a StringList type I export to #roto, with a .contains() method, so:
let asn = ASN
.as_asn_matcher()?
.lookup(request.header("x-forwarded-for"))
.to_string();
if BANNED_ASNS.contains(asn) {
return garbage("banned-asn");
}
...this will work correctly, though it will be slower than an AhoCorasick match if the list is longer than about a dozen ASNs. The string conversion is also comparatively expensive.
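To make the partial-match problem concrete, here is a minimal Rust sketch. It is not roto and not the actual implementation: `substring_match` and `exact_match` are hypothetical helpers, and `str::contains` stands in for the substring-style hits an AhoCorasick automaton reports.

```rust
/// Stand-in for substring-style matching: true if any pattern occurs
/// anywhere inside `haystack`. (Hypothetical helper, not the real matcher.)
fn substring_match(patterns: &[&str], haystack: &str) -> bool {
    patterns.iter().any(|p| haystack.contains(*p))
}

/// Exact membership, comparable to calling `.contains()` on a string list.
/// (Hypothetical helper.)
fn exact_match(list: &[&str], needle: &str) -> bool {
    list.iter().any(|item| *item == needle)
}
```

With a banned list of just "123", `substring_match` also flags ASN "1234", while `exact_match` does not.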
For country-based lookups... hmm...
let country = COUNTRY
.as_country_matcher()?
.lookup(request.header("x-forwarded-for"));
if BANNED_COUNTRIES.matches(country) {
return garbage("banned-country");
}
This'd work. We can use AhoCorasick here, because the .lookup() function in this case will return the ISO code of the country, always two characters, and always unique, so there's no partial matching.
This is pretty much fine as it is. One has to decide whether to use a matcher, or a .contains() on a string list (the latter requires less boilerplate; the difference in speed for small sets should be negligible, though), but other than that, I don't think I have much to improve for this case.
Ideally, for both ASN and country matching, I'd like to write something like this:
if BANNED_ASNS.matches(request.header("x-forwarded-for")) {
return garbage("banned-asn");
}
As in, the matcher would be able to match against a set of ASNs/countries. For countries, this is doable, because I already have a StringList type. For ASNs... I have nothing, and I'm not sure I'd like to introduce a U32List type.
"Why not make the matcher mutable, and push the ASNs one by one?"
Because then I still have the same problem! The only loop in roto is a while loop, and the list of ASNs is coming from a configuration file, and... yeah. Need some kind of list there.
My current thinking is that I'll abuse StringList, and require the ASNs to be given as strings (without the AS prefix), and I'll internally convert to u32. No need for a new type, and the conversion is done at init time, so if it's more expensive than a U32List would be, it's still ok, and I didn't need to introduce Yet Another list-y type.
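The init-time conversion could look roughly like this in Rust. This is only a sketch under assumptions: `parse_asns` is a hypothetical name, and a plain slice of strings stands in for StringList.

```rust
use std::collections::HashSet;
use std::num::ParseIntError;

/// Convert ASNs given as strings (without the "AS" prefix) into u32 values.
/// Runs once at init time; any non-numeric entry is reported as an error.
/// (Hypothetical helper; `&[&str]` stands in for a StringList.)
fn parse_asns(raw: &[&str]) -> Result<HashSet<u32>, ParseIntError> {
    raw.iter().map(|s| s.trim().parse::<u32>()).collect()
}
```

Since the parsed values are compared as integers, a list containing "64496" cannot accidentally match a shorter or longer ASN, and an entry that still carries the AS prefix fails at init rather than at request time.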
But... if the ASN/Country matchers match against a known set of ASNs/Countries, what about .lookup(addr) (which currently returns the ASN/Country for addr) and .within(addr, ASN/Country) (which returns a bool)?
I believe these are still useful methods. Perhaps less so for Nam-Shub of Enki, but other request handlers may wish to do things differently.
Luckily, they can still work! The matcher will internally store the database, and the country/ASN list separately, and for .lookup() and .within(), it will simply not use the list, only the db.
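A rough Rust sketch of that split, with loud assumptions: `AsnMatcher` and its fields are hypothetical, and a `HashMap` keyed by address stands in for the real address-to-ASN database.

```rust
use std::collections::{HashMap, HashSet};
use std::net::IpAddr;

/// Hypothetical ASN matcher that keeps the database and the list apart.
struct AsnMatcher {
    /// Stand-in for a real address-to-ASN database.
    db: HashMap<IpAddr, u32>,
    /// The configured set of ASNs; only `matches` consults it.
    list: HashSet<u32>,
}

impl AsnMatcher {
    /// Returns the ASN for `addr`, using only the database.
    fn lookup(&self, addr: &str) -> Option<u32> {
        let ip: IpAddr = addr.parse().ok()?;
        self.db.get(&ip).copied()
    }

    /// True if `addr` belongs to `asn`, using only the database.
    fn within(&self, addr: &str, asn: u32) -> bool {
        self.lookup(addr) == Some(asn)
    }

    /// True if the ASN of `addr` is in the configured list.
    fn matches(&self, addr: &str) -> bool {
        self.lookup(addr)
            .map_or(false, |asn| self.list.contains(&asn))
    }
}
```

Here `lookup` and `within` never touch `list`, so they keep working whether or not a list was configured.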
Another advantage of this is that I will not need the .as_asn_matcher()? part! Because these matchers will now implement .matches(), I can dispatch automatically.
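One way to picture that dispatch, as a Rust sketch and not a description of how the roto bindings actually do it: every name here (`Matcher`, `PatternMatcher`, `CountryMatcher`, `check`) is hypothetical, and a `HashMap` stands in for the country database.

```rust
use std::collections::HashMap;

/// Hypothetical common interface: anything that can answer `.matches()`.
trait Matcher {
    fn matches(&self, input: &str) -> bool;
}

/// Substring-style pattern matcher (stand-in for an AhoCorasick one).
struct PatternMatcher {
    patterns: Vec<String>,
}

impl Matcher for PatternMatcher {
    fn matches(&self, input: &str) -> bool {
        self.patterns.iter().any(|p| input.contains(p.as_str()))
    }
}

/// Country matcher: looks the address up, then checks the banned list.
struct CountryMatcher {
    /// Stand-in database: address -> ISO country code.
    db: HashMap<String, String>,
    banned: Vec<String>,
}

impl Matcher for CountryMatcher {
    fn matches(&self, addr: &str) -> bool {
        self.db
            .get(addr)
            .map_or(false, |code| self.banned.contains(code))
    }
}

/// The caller no longer needs to know which kind of matcher it holds.
fn check(matcher: &dyn Matcher, input: &str) -> bool {
    matcher.matches(input)
}
```

The point of the sketch is only that the call site is identical for both matcher kinds, so no explicit conversion step is needed before calling `.matches()`.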