In completely unrelated news, I have a half-finished branch that can render frames from Doom. That branch will never be part of iocaine, but if I end up making a plugin system, then it will be a supported plugin.

Why? "Because a game engine is anything that can execute Lua scripts" (context), and iocaine can execute Lua scripts, and as such, is a game engine, and a test of a game engine is to see if it can be used to run Doom. iocaine can run¹ Doom.

¹: For some limited values of "run".

#iocaine

After that, I will update Nam-Shub of Enki & nixocaine, and then I'll ponder some more about separating decisions from output generation.

#iocaine

Current state is up on the where-in-the-world-is-this-slop-machine branch.

I'll clean up the commits later, and will see if I can improve the implementation and/or API, then I'll merge it to main.

#iocaine

But.... if the ASN/Country matchers match against a known set of ASNs/Countries, what about .lookup(addr) (which currently returns the ASN/Country for addr) and .within(addr, ASN/Country) (which returns a bool)?

I believe these are still useful methods. Perhaps less so for Nam-Shub of Enki, but other request handlers may wish to do things differently.

Luckily, they can still work! The matcher will internally store the database, and the country/asn list separately, and for .lookup() and .within(), it will simply not use the list, only the db.
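
A minimal Rust sketch of that split, not iocaine's actual implementation: the `AsnMatcher` type, its field names, and the `HashMap` standing in for the ASN database are all assumptions made for illustration. The point is only that `.lookup()` and `.within()` touch the database alone, while `.matches()` also consults the stored list.

```rust
use std::collections::{HashMap, HashSet};
use std::net::IpAddr;

/// Hypothetical matcher: the database and the list are stored separately.
/// The HashMap is a stand-in; a real ASN database does prefix lookups.
struct AsnMatcher {
    db: HashMap<IpAddr, u32>,
    list: HashSet<u32>,
}

impl AsnMatcher {
    /// Returns the ASN for `addr`, using only the database.
    fn lookup(&self, addr: IpAddr) -> Option<u32> {
        self.db.get(&addr).copied()
    }

    /// True if `addr` belongs to `asn`; the stored list is ignored.
    fn within(&self, addr: IpAddr, asn: u32) -> bool {
        self.lookup(addr) == Some(asn)
    }

    /// True if the ASN of `addr` is in the stored list.
    fn matches(&self, addr: IpAddr) -> bool {
        self.lookup(addr)
            .map_or(false, |asn| self.list.contains(&asn))
    }
}

/// Builds a tiny example matcher from documentation-range addresses and ASNs.
fn example_matcher() -> AsnMatcher {
    let mut db: HashMap<IpAddr, u32> = HashMap::new();
    db.insert("192.0.2.1".parse().unwrap(), 64496);
    db.insert("198.51.100.7".parse().unwrap(), 64497);
    AsnMatcher {
        db,
        list: HashSet::from([64496u32]),
    }
}
```

With this shape, a handler that only wants the raw ASN can keep calling `.lookup()`, and one that wants a yes/no answer against the configured set can call `.matches()`.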

#iocaine

Another advantage of this is that I will not need the .as_asn_matcher()? part! Because these matchers will now implement .matches(), I can dispatch automatically.

#iocaine

Ideally, for both ASN and country matching, I'd like to write something like this:

if BANNED_ASNS.matches(request.header("x-forwarded-for")) {
  return garbage("banned-asn");
}

As in, the matcher would be able to match against a set of ASNs/countries. For countries, this is doable, because I already have a StringList type. For ASNs... I have nothing, and I'm not sure I'd like to introduce a U32List type.

"Why not make the matcher mutable, and push the ASNs one by one?"

Because then I still have the same problem! The only loop in roto is a while loop, and the list of ASNs is coming from a configuration file, and... yeah. Need some kind of list there.

#iocaine

My current thinking is that I'll abuse StringList, and require the ASNs to be given as strings (without the AS prefix), and I'll internally convert to u32. No need for a new type, and the conversion is done at init time, so if it's more expensive than a U32List would be, it's still ok, and I didn't need to introduce Yet Another list-y type.
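
The init-time conversion could look roughly like this in Rust. This is a sketch under assumptions: `parse_asn_list` is a hypothetical helper, and a slice of `&str` stands in for whatever StringList hands over internally.

```rust
use std::collections::HashSet;
use std::num::ParseIntError;

/// Converts ASNs given as decimal strings (no "AS" prefix) into a set of u32.
/// Intended to run once at init time; one malformed entry fails the whole load.
fn parse_asn_list(raw: &[&str]) -> Result<HashSet<u32>, ParseIntError> {
    raw.iter().map(|s| s.trim().parse::<u32>()).collect()
}
```

Failing on the first bad entry means a typo in the configuration surfaces at startup rather than silently shrinking the ban list; whether that is the right trade-off is a design choice.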

#iocaine

For country-based lookups... hmm...

let country = COUNTRY
  .as_country_matcher()?
  .lookup(request.header("x-forwarded-for"));
if BANNED_COUNTRIES.matches(country) {
  return garbage("banned-country");
}

This'd work. We can use AhoCorasick here, because the .lookup() function in this case will return the ISO code of the country, always two characters, and always unique, so there's no partial matching.

This is pretty much fine as it is. One has to decide whether to use a matcher, or a .contains() on a string list (the latter requires less boilerplate; the difference in speed in small sets should be negligible though), but other than that, I don't think I have much to improve for this case.

#iocaine

Another reason the previous ASN->string->AhoCorasick thing doesn't quite work is that the pattern matching is a partial match. Thankfully, there's a StringList type I export to #roto, with a .contains() method, so:

let asn = ASN
  .as_asn_matcher()?
  .lookup(request.header("x-forwarded-for"))
  .to_string();
if BANNED_ASNS.contains(asn) {
  return garbage("banned-asn");
}

...this will work correctly, though, it will be slower than an AhoCorasick match if the list is longer than about a dozen ASNs. And the string conversion is comparatively expensive.
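
To make the partial-match problem concrete, here is a small Rust sketch. It does not use the AhoCorasick crate: `str::contains` stands in for substring-style pattern search, and both function names are made up for illustration.

```rust
/// Substring semantics: true if any pattern occurs anywhere in `haystack`.
/// `str::contains` is a stand-in for a multi-pattern substring searcher.
fn partial_match(patterns: &[&str], haystack: &str) -> bool {
    patterns.iter().any(|p| haystack.contains(*p))
}

/// Exact semantics: true only if `needle` equals one of the list entries.
fn exact_match(list: &[&str], needle: &str) -> bool {
    list.iter().any(|entry| *entry == needle)
}
```

With a ban list of just "64496", the substring version also flags ASN 164496, because "64496" occurs inside "164496"; the exact version does not.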

#iocaine

let asn = ASN
  .as_asn_matcher()?
  .lookup(request.header("x-forwarded-for"))
  .to_string();
if BANNED_ASNS.matches(asn) {
  return garbage("banned-asn");
}

Not ideal, due to having to convert a u32 to a string, and then match that string against a pattern (with AhoCorasick), but it gets the job done. As a first approximation, this is okay-ish, but I'll be iterating on this a bit more.

#iocaine

I know #iocaine doesn't have a fully fledged howto for using #nginx as the reverse proxy, but I have a lot in my nginx config currently, so I want to try and get it working there.
After figuring out that the different configuration pages don't agree on which socket path to use for client connections to iocaine, I now have the 421 error being returned to the browser, but I don't understand what I need to fix to get to a working setup.
I have no log output when accessing blog.cerberos.id.au.
#askFedi

upstream iocaine {
  server unix://run/iocaine/iocaine.socket;
}

server {
    listen       443 ssl;
    server_name  blog.cerberos.id.au;
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;

    ssl_protocols        TLSv1.2;

    ssl_session_cache    shared:SSL:1m;
    ssl_session_timeout  5m;

    ssl_ciphers  HIGH:!aNULL:!MD5;
    ssl_prefer_server_ciphers  on;

    location / {
        proxy_pass http://iocaine;
        proxy_cache off;
        proxy_intercept_errors on;
        error_page 421 = @fallback;
    }

    location @fallback {
        root   /data/blog;
        internal;
    }
    
    ssl_certificate /etc/letsencrypt/live/blog.cerberos.id.au/fullchain.pem; # managed by Certbot
    ssl_certificate_key /etc/letsencrypt/live/blog.cerberos.id.au/privkey.pem; # managed by Certbot

    #debug logging options
    error_log /var/log/nginx/blog.error.log warn;
    access_log /var/log/nginx.access.log;
}

Crazy thought: what if I added another request handler language to #iocaine? What if it was #Rust?

Not quite sure how it would work, but the request handler would be a crate-type = ["cdylib"] thing. That might be even more performant than Roto, and would allow doing a whole lot of things neither the Roto nor the Lua/Fennel engines can (because they're intentionally locked down, while due to the nature of being a cdylib, Rust wouldn't be).

#iocaine has been up for 14m 48s, and spent 8m 8s dealing with - gestures hands wildly - everything.

In the past 24 hours, it served 31.53M requests, 97.27% of which were garbage, 2.71% passed through unscathed, and 0.005% were fed to the Cookie Monster. This required about 116.21MiB of memory on average, and 71.09GiB of absolute trash was generated for the nastiest visitors.

Top garbage consumers were:

  1. Disguised bots - 23.00M
  2. Enthusiastic guestbook visitors - 2.08M
  3. Claude - 1.34M
  4. OpenAI - 706.76K
  5. Facebook - 398.74K
  6. Amazon - 279.96K
  7. Commercial scrapers - 215.17K
  8. Google - 1.59K

Various other agents slurped through 590.44K pages of unhinged junk, bless their little hearts.

In these trying times, 0.07% of all requests were likely of human origin: I hope you enjoyed your stay, and will visit again! Of all requests iocaine let into the garden, 91.37% were from Fediverse software. Thank you! #FediHug

#AIStatsPorn

DailyMetrics {
resources: ResourceMetrics {
uptime: "18h 16m 20s",
cpu_time: "3h 30m 34s",
memory_used: "94.03MiB",
},
dashboard_url: "",
overview: OverviewMetrics {
total_requests: "17.46M",
garbage_generated: "34.58GiB",
breakdown: Breakdown {
garbage_percent: "86.93%",
reject_percent: "13.04%",
challenge_percent: "0.009%",
human_percent: "0.17%",
fedi_percent: "96.89%",
},
},
}

So close! Just have to format these into a toot template, and we're almost done.
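
A rough Rust sketch of that last step, assuming the struct layout printed above and a template modelled on the daily stats toots. `render_toot` and `sample_metrics` are hypothetical names, and mapping `reject_percent` to "passed through unscathed" is a guess based on how the numbers line up, not something confirmed here.

```rust
#[allow(dead_code)]
struct ResourceMetrics {
    uptime: String,
    cpu_time: String,
    memory_used: String,
}

#[allow(dead_code)]
struct Breakdown {
    garbage_percent: String,
    reject_percent: String,
    challenge_percent: String,
    human_percent: String,
    fedi_percent: String,
}

#[allow(dead_code)]
struct OverviewMetrics {
    total_requests: String,
    garbage_generated: String,
    breakdown: Breakdown,
}

#[allow(dead_code)]
struct DailyMetrics {
    resources: ResourceMetrics,
    dashboard_url: String,
    overview: OverviewMetrics,
}

/// Fills the toot template from pre-formatted metric strings.
fn render_toot(m: &DailyMetrics) -> String {
    let r = &m.resources;
    let o = &m.overview;
    let b = &o.breakdown;
    format!(
        "#iocaine has been up for {}, and spent {} dealing with \
         - gestures hands wildly - everything.\n\n\
         In the past 24 hours, it served {} requests, {} of which were garbage, \
         {} passed through unscathed, and {} were fed to the Cookie Monster. \
         This required about {} of memory on average, and {} of absolute trash \
         was generated for the nastiest visitors.\n\n\
         In these trying times, {} of all requests were likely of human origin. \
         Of all requests iocaine let into the garden, {} were from Fediverse software.",
        r.uptime, r.cpu_time,
        o.total_requests, b.garbage_percent, b.reject_percent, b.challenge_percent,
        r.memory_used, o.garbage_generated,
        b.human_percent, b.fedi_percent,
    )
}

/// The values from the debug output above, for trying the template out.
fn sample_metrics() -> DailyMetrics {
    DailyMetrics {
        resources: ResourceMetrics {
            uptime: "18h 16m 20s".into(),
            cpu_time: "3h 30m 34s".into(),
            memory_used: "94.03MiB".into(),
        },
        dashboard_url: String::new(),
        overview: OverviewMetrics {
            total_requests: "17.46M".into(),
            garbage_generated: "34.58GiB".into(),
            breakdown: Breakdown {
                garbage_percent: "86.93%".into(),
                reject_percent: "13.04%".into(),
                challenge_percent: "0.009%".into(),
                human_percent: "0.17%".into(),
                fedi_percent: "96.89%".into(),
            },
        },
    }
}
```

Since every field is already a formatted string, the template step is pure string substitution; the per-agent "top garbage consumers" list is left out of this sketch.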

#iocaine has been up for 19h 7m 36s, and spent 3h 57m 17s dealing with - gestures hands wildly - everything.

In the past 24 hours, it served 19.43M requests, 88.20% of which were garbage, 11.77% passed through unscathed, and 0.009% were fed to the Cookie Monster. This required about 95.20MiB of memory on average, and 39.17GiB of absolute trash was generated for the nastiest visitors.

Top garbage consumers were:

  1. Disguised bots - 11.67M
  2. Claude - 1.51M
  3. Enthusiastic guestbook visitors - 933.63K
  4. OpenAI - 737.35K
  5. Other - 429.48K
  6. Facebook - 351.21K
  7. Amazon - 279.15K
  8. Commercial scrapers - 269.35K
  9. Bots hitting generated URLs - 15.24K
  10. Google - 4.38K

In these trying times, 0.16% of all requests were likely of human origin: I hope you enjoyed your stay, and will visit again! Of all requests iocaine let into the garden, 96.85% were from Fediverse software. Thank you! #FediHug

#AIStatsPorn

With #iocaine 3.0, where the request handler is mandatory, I kept thinking how to make nixocaine and nam-shub-of-enki play well together. I came up with funky schemes and many nix crimes.

Last night, just as I was going to bed, I realized I don't need any of that. Since nixocaine is already a separate thing, and does not build on anything but the package provided by iocaine, I can simply make it use nam-shub-of-enki as an input too, and rather than having a separate NSoE module that integrates with nixocaine, it would just all be in nixocaine.

#iocaine has been up for 11d 12h 45min, and spent 1d 16h 7min dealing with - gestures hands wildly - everything.

In the past 24 hours, it served 12.10M requests, 98.82% of which were garbage, 1.18% passed through unscathed, and 0.01% were fed to the Cookie Monster. This required about 104.97MiB of memory on average, and 34.57GiB of absolute trash was served to the nastiest visitors.

Top three garbage consumers were:

  1. Bots trying to hide (and failing) - 8.31M
  2. ClaudeBot - 1.74M
  3. GPTBot - 814.69k

In these trying times, 0.11% of all requests were likely of human origin: I hope you enjoyed your stay, and will visit again! Of all requests iocaine let into the garden, 69.25% were from Fediverse software. Thank you! #FediHug

#AIStatsPorn

#iocaine has been up for 7d 12h 45min, and spent 22h 59min dealing with - gestures hands wildly - everything.

In the past 24 hours, it served 10.66M requests, 98.25% of which were garbage, 1.73% passed through unscathed, and 0.03% were fed to the Cookie Monster. This required about 92.67MiB of memory on average, and 30.96GiB of absolute trash was served to the nastiest visitors.

Top three garbage consumers were:

  1. Bots trying to hide (and failing) - 7.56M
  2. ClaudeBot - 925.86k
  3. GPTBot - 799.15k

In these trying times, 0.14% of all requests were likely of human origin: I hope you enjoyed your stay, and will visit again! Of all requests iocaine let into the garden, 71.05% were from Fediverse software. Thank you! #FediHug

#AIStatsPorn