Discussion · bonfire.cafe

@hipsterelectron@circumstances.run · 15 hours ago

@SRAZKVT you can use my fractal zip format for this purpose actually (one of the design goals was representing a dataset with changes over time). that's actually a really fascinating corollary of using it to represent atomic transitions between reproducible filesystem states....................i did not realize that VCS is precisely a formulation of atomic filesystem transactions. need to think about this further


@hipsterelectron@circumstances.run · 11 hours ago

in fact, this ends up being very similar to how tar or zip archives work, because these were intended as generic serialization formats for filesystem trees!

making these unambiguous invokes a whole separate discussion around what a checksum ends up "proving". but i do not have time to get into tree hashing of zip archives at this time or hash collision attacks on .tar.zst files--for now, we assume:

checksums don't collide,
the filesystem can be globally write-locked while recursively iterating,
we can "stop" and "restart" the kernel build process at arbitrary points

@hipsterelectron@circumstances.run · 11 hours ago

(in fact, these are all things the linux operating system kernel should be able to let us do with a filesystem tree, but that's yet another topic for a separate time)

@hipsterelectron@circumstances.run · 11 hours ago

so: we can take a filesystem tree like the kernel build directory, and we can convert that to a checksum. the "reproducible builds" organization (despite lack of OS-level support for atomicity) defines "reproducibility" in these terms:

if the kernel build directory starts at checksum C_0,
and after the build process, the kernel build directory matches the expected checksum C_1,

then the build process is "reproducible". it produces C_1 from C_0 (after executing some vaguely-defined process or processes). this is repeatable if there is sufficient information available to produce C_1 from C_0
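a minimal sketch of that definition, under the three assumptions listed earlier. `tree_checksum` and `check_build` are hypothetical names, and hashing only relative paths plus file contents (ignoring timestamps, owners and permissions) is a choice made for this sketch, not something the reproducible builds project specifies:

```python
import hashlib
import os


def tree_checksum(root: str) -> str:
    """Hash every file under root in a deterministic traversal order.

    Only relative paths and file contents are hashed; timestamps, owners
    and permissions are deliberately ignored in this sketch.
    """
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sorting in place fixes the traversal order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            with open(path, "rb") as f:
                data = f.read()
            # Length-prefix each field so ("ab", "c") and ("a", "bc") differ.
            for field in (rel.encode(), data):
                h.update(len(field).to_bytes(8, "big"))
                h.update(field)
    return h.hexdigest()


def check_build(root: str, build, c0: str, c1: str) -> bool:
    """Return True if build(root) takes the tree from checksum c0 to c1."""
    if tree_checksum(root) != c0:
        raise ValueError("tree does not start at C_0")
    build(root)
    return tree_checksum(root) == c1
```

`build` stands in for the "vaguely-defined process": any callable that mutates the tree in place.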

@hipsterelectron@circumstances.run · 11 hours ago

i say "if there is sufficient information", but the reproducible builds evangelism strike force team doesn't think that way. their modus operandi is to repeatedly neg maintainers to make extremely confusing and subtle modifications to their release process for all their users.

it would only be necessary to change the build process for all users if the reproducible builds squadron believed very very deeply that the maintainer's build output is the ground truth for everyone else to reproduce.

.........which brings us to the issue at hand for the kernel.

@hipsterelectron@circumstances.run · 11 hours ago

for a representative example of how the reproducible builds evangelism strike force approaches maintainers, consider this post from an arch linux package maintainer to the bug report mailing list for gnu automake: https://lists.gnu.org/archive/html/bug-automake/2025-08/msg00000.html

In Arch Linux our automake package includes /usr/share/doc/automake/amhello-1.0.tar.gz. When we rebuild this package using our rebuilder to check for reproduciblity the uid/gid and timestamps are not normalized

the arch linux package build system orchestrates the build process,
the arch linux automake package decides to include extraneous test data in the output,
the arch linux "reproducibility" checker does not automatically zero out fields that are known to induce non-matching checksums,

.........so arch linux files a "bug" against automake.
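for illustration, here is roughly what "zeroing out the fields" could look like on the checker side. this is a sketch using python's standard `tarfile` module, not arch's actual rebuilder; `normalize_tar` is a hypothetical name, and pinning the output to the ustar format is an assumption made so the bytes don't depend on library defaults:

```python
import io
import tarfile


def normalize_tar(data: bytes) -> bytes:
    """Re-pack a tar archive with owner and timestamp fields zeroed."""
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:*") as src, \
         tarfile.open(fileobj=out, mode="w", format=tarfile.USTAR_FORMAT) as dst:
        # Sort members so the archive order does not depend on the builder.
        for member in sorted(src.getmembers(), key=lambda m: m.name):
            member.uid = member.gid = 0
            member.uname = member.gname = ""
            member.mtime = 0
            fileobj = src.extractfile(member) if member.isreg() else None
            dst.addfile(member, fileobj)
    return out.getvalue()
```

two archives with identical file contents but different uid/gid/mtime values should normalize to the same bytes, so their checksums can be compared without touching upstream's release process.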

bug#79170: Please make amhello-1.0.tar.gz reproducible

@hipsterelectron@circumstances.run · 5 hours ago

to remove all doubt, another arch linux maintainer follows up: https://lists.gnu.org/archive/html/bug-automake/2025-11/msg00007.html

You don't need to worry about the value, this variable is meant to be set externally. From the reproducible-builds.org documentation, this is suggested for shell scripts on GNU systems:

(note the username kpcyrd here. he'll be coming up again soon.)

this is a very specific set of build process requirements specific to the arch linux packaging system, which our friendly neighborhood distro maintainer is able to specify with precise detail.

and this is filed as a bug upstream, because the reproducible builds evangelism strike force requires "reproducibility" in the form of a code injection API to achieve a chosen-plaintext attack.


@hipsterelectron@circumstances.run · 5 hours ago

anyway, this guy's website is named "vulns" and his work is sponsored by google and the linux foundation https://vulns.xyz/2021/07/disagreeing-rebuilders/

[we will return to the kernel now. i promise this was necessary]

Disagreeing rebuilders and what that means - vulns.xyz

Today we’ve noticed a disagreement between the Arch Linux rebuilders about the “cross” package, a popular @rustlang cross-compile tool. One rebuilder reported they’ve succesfully reproduced the package, while the other reported they couldn’t. Let’s have a look what that means.


@hipsterelectron@circumstances.run · 4 hours ago

let's refresh our memory on the module signature problem statement: https://lwn.net/Articles/1012946/

This mechanism, which checks module integrity based on hashes computed at build time instead of using cryptographic signatures

this is where we can finally start to describe why that justification makes absolutely no fucking sense!

Hash-based module integrity checking

On January 20, Thomas Weißschuh shared a new patch set implementing an alternate method for c [...]

@hipsterelectron@circumstances.run · 4 hours ago

a signature is essentially just the result of encrypting a checksum with a private key, so anyone can decrypt with the corresponding public key to obtain the checksum.

it's true that there are other ways with more constraints, but the standard methods like EdDSA quite literally accept an arbitrary cryptographic hash function (checksum) as a parameter.

EdDSA - Wikipedia

@hipsterelectron@circumstances.run · 4 hours ago

that checksum is in fact exactly the information we need for "reproducibility"! and we very much do want to ensure we have the exact same checksum as was generated by the upstream maintainer--that's what the public key verification achieves!

the cryptographic "proof" resulting from a private key-based checksum signature is very specifically the human assurance that the human owner of that private key must have generated the corresponding checksum!

@hipsterelectron@circumstances.run · 4 hours ago

if we accept that a cryptographic checksum is "proof" of reproducibility, then a signature scheme is strictly more powerful--proof of reproducible output, and proof that the output checksum was not modified after being generated by the holder of the private key!
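a toy illustration of "signature = checksum + private key", using textbook RSA with python integers. everything here is an assumption for demonstration: the primes are tiny, there is no padding, and the names are made up. it is also specific to RSA-style schemes, where verification really does recover the checksum; EdDSA verification checks an equation instead of handing the hash back.

```python
import hashlib

# Toy parameters: two Mersenne primes. Far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (Python 3.8+)


def checksum(data: bytes) -> int:
    # Reduce the SHA-256 digest into the toy modulus; real schemes pad instead.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Apply the private exponent to the checksum."""
    return pow(checksum(data), D, N)


def recover_checksum(signature: int) -> int:
    """Anyone holding the public key (N, E) can recover the signed checksum."""
    return pow(signature, E, N)


def verify(data: bytes, signature: int) -> bool:
    return recover_checksum(signature) == checksum(data)
```

note that `sign` is deterministic: the same key and the same module bytes always give the same signature.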

let's take a look at that patch series now.

@hipsterelectron@circumstances.run · 4 hours ago

lwn complains about AI scrapers when you try to access their locally hosted copy of diffs from LKML. https://web.archive.org/web/20250409044448/https://lwn.net/ml/all/20250120-module-hashes-v2-2-ba1184e27b7f@weissschuh.net/

(why not just link to LKML if they're getting scraped so hard? if you ask questions like this you will not like the answers you find)

this diff [2/6 in the patchset] adds a new config option that disables the existing config option to enforce signature checking. real Kconfig heads will understand that this config dependency is equivalent to an override mechanism. so you can disable module signing even if the user config requires it.

that's not even the cryptographic part yet, just an extra build system backdoor. the cryptographic claims are next.

[PATCH v2 2/6] module: Make module loading policy usable without MODULE_SIG [LWN.net]

@hipsterelectron@circumstances.run · 4 hours ago

this is the magnum opus of the reproducible builds evangelism strike force: https://web.archive.org/web/20250408191140/https://lwn.net/ml/all/20250120-module-hashes-v2-6-ba1184e27b7f@weissschuh.net/. let's evaluate these claims:

The current signature-based module integrity checking has some drawbacks in combination with reproducible builds:

drawbacks in combination? that makes it sound like reproducible builds are a simple config setting. that would be nice, right? if reproducible builds had precise semantics? and they fucked off and stopped bothering everyone else?

@hipsterelectron@circumstances.run · 4 hours ago

Either the module signing key is generated at build time, which makes the build unreproducible,

we're going to examine this claim in more detail presently. but first we absolutely need to highlight the rest of this sentence:

or a static key is used, which precludes rebuilds by third parties and makes the whole build and packaging process much more complicated.

i cannot possibly express the violent feelings within me upon reading this statement:

a "static key" refers to "literally a normal key, the way it worked before".
does it "preclude rebuilds by third parties"? (we will evaluate this below.)
"makes the whole build and packaging process much more complicated" -- again, this is literally the way it works right now.

@hipsterelectron@circumstances.run · 4 hours ago

so the reproducible builds evangelism strike force get to whine about how complicated it is to make the build reproducible. if you hate your job then maybe choose a different line of work?

but [6/6] in this patchset has so much more to show us! here is the reproducible build squadron's best and brightest, making things less complicated:

diff --git a/Documentation/kbuild/reproducible-builds.rst b/Documentation/kbuild/reproducible-builds.rst
index f2dcc39044e66ddd165646e0b51ccb0209aca7dd..6a742ad745113a9267223b33810dbc7218c47d4c 100644
--- a/Documentation/kbuild/reproducible-builds.rst
+++ b/Documentation/kbuild/reproducible-builds.rst
@@ -79,7 +79,10 @@ generate a different temporary key for each build, resulting in the
 modules being unreproducible. However, including a signing key with
 your source would presumably defeat the purpose of signing modules.

-One approach to this is to divide up the build process so that the
+Instead ``CONFIG_MODULE_HASHES`` can be used to embed a static list
+of valid modules to load.
+
+Another approach to this is to divide up the build process so that the
 unreproducible parts can be treated as sources:

 1. Generate a persistent signing key. Add the certificate for the key

@hipsterelectron@circumstances.run · 3 hours ago

so, instead of forcing our brave and noble reproducible builds advocates to suffer the cruel and unusual punishment of "splitting up the build process", we now have the much less complex alternative of "adding a backdoor in Kconfig that short-circuits signature checking at runtime"

@hipsterelectron@circumstances.run · 3 hours ago

recall lwn's comment on this dastardly mathematical trickery:

It's tempting to search for a clever cryptographic solution, but nobody has yet proposed one.

i had earlier today developed a lengthy protocol that would have solved a much harder problem, but i've just realized i was still giving them far too much credit.

this isn't a "clever cryptographic solution". they're using deterministic fucking signatures. reproducing the build just means providing a private key to the build process, the method that is already supported for module signing.

@hipsterelectron@circumstances.run · 3 hours ago

it's right there in Documentation/admin-guide/module-signing.rst:

 (4) :menuselection:`File name or PKCS#11 URI of module signing key`
 (``CONFIG_MODULE_SIG_KEY``)

 Setting this option to something other than its default of
 ``certs/signing_key.pem`` will disable the autogeneration of signing keys
 and allow the kernel modules to be signed with a key of your choosing.

you can even sign off by hand:

To manually sign a module, use the scripts/sign-file tool available in
the Linux kernel source tree. The script requires 4 arguments:

 1. The hash algorithm (e.g., sha256)
 2. The private key filename or PKCS#11 URI
 3. The public key filename
 4. The kernel module to be signed

@hipsterelectron@circumstances.run · 3 hours ago

so reproducible builds advocates claim to understand cryptography enough to interpret a cryptographic checksum, but they don't seem to understand that signatures are strictly more powerful proofs than checksums, even though the module signing guide makes it quite clear that signatures are just checksums++.

so, what do they mean by "rebuilds", and what does the CONFIG_MODULE_HASHES backdoor let them achieve?

@hipsterelectron@circumstances.run · 3 hours ago

well, first let's recall the final word of warning from module-signing.rst:

Since the private key is used to sign modules, viruses and malware could use the private key to sign modules and compromise the operating system.

well, i wouldn't go so far as to call arch linux distro packagers "viruses and malware". that's a little harsh.

The private key must be either destroyed or moved to a secure location and not kept in the root node of the kernel source tree.

this is, of course, exactly what CONFIG_MODULE_HASHES does, enabling viruses and malware to sign modules and compromise the operating system.

@hipsterelectron@circumstances.run · 3 hours ago

but surely i'm mistaken. let's check lwn again:

If the signing keys are publicly available for use in recreating the build, malicious actors could also sign modified loadable modules with them.

that is indeed exactly what module-signing.rst warned us about!

If they aren't publicly available, the build can't be reproduced.

you see what this means right? "reproduced" doesn't mean "reproduced". "reproduced" means "malicious actors can also sign modified loadable modules".

@hipsterelectron@circumstances.run · 3 hours ago

why do i refer to CONFIG_MODULE_HASHES as a backdoor? because it overrides signature checking by coming before it:

diff --git a/kernel/module/main.c b/kernel/module/main.c
index effe1db02973d4f60ff6cbc0d3b5241a3576fa3e..094ace81d795711b56d12a2abc75ea35449c8300 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -3218,6 +3218,12 @@ static int module_integrity_check(struct load_info *info, int flags)
 {
 	int err = 0;
 
+	if (IS_ENABLED(CONFIG_MODULE_HASHES)) {
+		err = module_hash_check(info, flags);
+		if (!err)
+			return 0;
+	}
+
 	if (IS_ENABLED(CONFIG_MODULE_SIG))
 		err = module_sig_check(info, flags);
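the control flow in that hunk can be modeled in a few lines. this is a python model of the quoted diff only: the two check functions are passed in as callables, and the final `return err` is an assumption, since the rest of the kernel function isn't shown in the hunk.

```python
def module_integrity_check(hashes_enabled, sig_enabled, hash_check, sig_check):
    """Model of the patched function: 0 means the module is accepted."""
    err = 0
    if hashes_enabled:
        err = hash_check()
        if not err:
            return 0  # hash matched: the signature check is never reached
    if sig_enabled:
        err = sig_check()
    return err  # assumption: the hunk does not show how err is used afterwards
```

with both options enabled, a module is accepted when either check passes, and a matching hash means the signature is not examined at all.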

@hipsterelectron@circumstances.run · 3 hours ago

this is all trying to dance around two very subtle points that require a very specialized technical understanding of cryptography to infer:

the signing keys are secret because downstream distro packagers and/or corporate sysadmins are the malicious actors which module signatures protect against!
more importantly, cryptographic signatures are just unspoofable checksums!

the fact that they're not "reproducible" is because they use secret data (the private key) to stop "malicious actors" from generating new checksums for "modified loadable modules"!

@hipsterelectron@circumstances.run · 3 hours ago

claiming a cryptographic signature is "nonreproducible" is a non sequitur--signatures are literally just a list of module checksums. it's the exact same fucking thing, except there is an additional cryptographic proof that modules haven't been modified since they left the custody of the key owner.

@hipsterelectron@circumstances.run · 3 hours ago

lwn, kpcyrd, Thomas Weißschuh, and everyone else associated with module hashing for "reproducibility" are either completely unaware of how cryptography works (and therefore should not be trusted with crypto), or they are lying in order to backdoor linux users (and therefore should not be trusted with crypto)

@hipsterelectron@circumstances.run · 3 hours ago

so here's the answer:

we build the kernel, then build the modules, then checksum the modules. this gets us checksums for the filesystem tree just before we introduce the secret data.
we verify the module checksums correspond to the ones produced by kernel maintainers by decrypting the published signatures with their public key. this is a stronger form of build reproducibility!

[in fact, this alone should be sufficient, because the kernel build process should be able to delay module signing until the very end. but for completeness, let's walk through how tree hashing lets us swap out a specific intermediate change and verify the result is correct.]

so our problem now can be decomposed into three stages:
(a) the filesystem state of the kernel build tree right before generating signatures can be checksummed in any way.
(b) adding module signatures is represented as a (normalized) filesystem delta (git can generate this).
(c) the kernel build process continues until completion. generate a normalized delta for the filesystem state change from (b) to (c) with git diff.

the key insight here: unless signatures are copied into more than one place, the delta from (b) to (c) should not depend upon any secret data. so, reproducibility is ensured by:

matching the checksum of the filesystem state at point (a).
matching the module checksums against the checksums decrypted with maintainer public keys from the upstream signatures.
matching the checksum of the filesystem delta from (b) to (c)!
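a sketch of the first and third checks, with in-memory dicts standing in for filesystem states and a plain dict comparison swapped in for `git diff`. all names here are hypothetical, and the middle check (module checksums against upstream signatures) is left out because it needs real key material:

```python
import hashlib


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def snapshot_checksum(tree: dict[str, bytes]) -> str:
    """Checksum a filesystem state given as {relative path: contents}."""
    lines = [f"{path}\0{digest(tree[path])}" for path in sorted(tree)]
    return digest("\n".join(lines).encode())


def normalized_delta(before: dict[str, bytes], after: dict[str, bytes]):
    """List the changes from one state to the next in sorted path order."""
    delta = []
    for path in sorted(set(before) | set(after)):
        if path not in after:
            delta.append(("delete", path, ""))
        elif path not in before:
            delta.append(("add", path, digest(after[path])))
        elif before[path] != after[path]:
            delta.append(("modify", path, digest(after[path])))
    return delta


def delta_checksum(before: dict[str, bytes], after: dict[str, bytes]) -> str:
    return digest(repr(normalized_delta(before, after)).encode())
```

if the signatures only live in files written at stage (b) and nothing after that copies them, the (b) to (c) delta never mentions them, so upstream and downstream get the same delta checksum even though their (b) snapshots differ.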

that actually still doesn't require any tree hashing either! but even if we absolutely cannot be assed to split the kernel build process into discrete a/b/c phases, or if the signature data from (b) influences the filesystem delta from (b) to (c) (e.g. if the signatures are copied into a text file), we can still make this shit 100000% reproducible, without any cryptography at all!

how? by simply erasing the signatures! take the upstream kernel build tree filesystem state, then replace any signature data with an equivalent length of zero bits (i.e. zero out the signatures). calculate the resulting checksum from upstream! do the same thing for downstream! YOUR CHECKSUMS WILL MATCH!

of course, this requires identifying the precise regions of data corresponding to signatures. but the kernel already knows this, because it has to read from those exact regions in order to validate module signatures upon load!
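a sketch of the zeroing step for a single module file. the trailer layout used here (signature bytes, then a 12-byte `struct module_signature` ending in a big-endian `sig_len`, then the magic string) is my understanding of `include/linux/module_signature.h` and should be checked against the tree you're building; `zero_signature` is a hypothetical helper:

```python
import hashlib
import struct

MAGIC = b"~Module signature appended~\n"
# Assumed layout of struct module_signature: five u8 fields
# (algo, hash, id_type, signer_len, key_id_len), 3 pad bytes, __be32 sig_len.
TRAILER = struct.Struct(">5B3xI")


def zero_signature(module: bytes) -> bytes:
    """Replace the appended signature bytes with zeros of the same length."""
    if not module.endswith(MAGIC):
        return module  # unsigned module: nothing to erase
    trailer_end = len(module) - len(MAGIC)
    trailer_start = trailer_end - TRAILER.size
    if trailer_start < 0:
        raise ValueError("truncated signature trailer")
    *_, sig_len = TRAILER.unpack(module[trailer_start:trailer_end])
    sig_start = trailer_start - sig_len
    if sig_start < 0:
        raise ValueError("sig_len is larger than the module")
    return module[:sig_start] + b"\0" * sig_len + module[trailer_start:]


def signature_free_checksum(module: bytes) -> str:
    return hashlib.sha256(zero_signature(module)).hexdigest()
```

this only matches across builders when the signatures have the same length, which is the "equivalent length of zero bits" condition.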

@hipsterelectron@circumstances.run · 3 hours ago

i believe the longer-term answer to "reproducible builds" involves OS-level support for filesystem checkpointing, per-process isolation of i/o state, and a deterministic ordering along with transactional semantics for propagating a series of i/o operations as an atomic filesystem delta.

which is to say: reproducible builds require reproducible process executions. and that requires per-process isolation of filesystem state.

sarah tonin :wlfBlep:

@SRAZKVT@tech.lgbt · 3 hours ago

@hipsterelectron plan9 has per process filesystems ig

@hipsterelectron@circumstances.run · 3 hours ago

@SRAZKVT KeyKOS kinda does

@hipsterelectron@circumstances.run · 3 hours ago

@SRAZKVT omg ugh NOBODY ever tries the literal only thing i want for perf optimization https://doc.cat-v.org/plan_9/4th_edition/papers/fs/

The file system server processes prevent deadlock in the buffers by always locking parent and child directory entries in that order. Since the entire directory structure is a hierarchy, this makes the locking well-ordered, preventing deadlock. The major problem in the locking strategy is that locks are at a block level and there are many directory entries in a single block. There are unnecessary lock conflicts in the directory blocks. When one of these directory blocks is tied up accessing the very slow WORM, then all I/O to dozens of unrelated directories is blocked.

@hipsterelectron@circumstances.run · 3 hours ago

@SRAZKVT literally i'm so upset bc:

making my writes visible to other processes should absolutely happen in an atomic transaction
persisting my writes to disk is (1) a completely different fucking thing than IPC (2) should also happen atomically
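the closest approximation available today, for one file at a time, is the write-to-temp-then-rename idiom. this is a sketch of that existing POSIX pattern, not the multi-path transaction being asked for; `publish_atomically` is a hypothetical name:

```python
import os
import tempfile


def publish_atomically(path: str, data: bytes, durable: bool = True) -> None:
    """Make data visible at path in one step; optionally persist it too."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            if durable:
                os.fsync(f.fileno())  # persistence: separate from visibility
        os.replace(tmp, path)  # visibility: readers see old or new, never a mix
    except BaseException:
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
    if durable:
        # fsync the directory so the rename itself survives a crash (POSIX only).
        dfd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dfd)
        finally:
            os.close(dfd)
```

the two concerns show up as two different calls: `os.replace` for visibility to other processes, `os.fsync` for persistence to disk.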

@hipsterelectron@circumstances.run · 2 hours ago

@SRAZKVT literally nobody has ever asked filesystems to act like a lock-free OS-global hash table. that's a ConcurrentHashMap that's not a "filesystem"

sarah tonin :wlfBlep:

@SRAZKVT@tech.lgbt · 2 hours ago

@hipsterelectron well there's a reason why ska, navi, mercurial, git, and i all use it as a hashmap

@hipsterelectron@circumstances.run · 2 hours ago

@SRAZKVT do git/ska/navi/hg/you map pages with DIRECT_IO for that?

@hipsterelectron@circumstances.run · 2 hours ago

@SRAZKVT thanks for identifying this, that's definitely worth supporting (direct block writes) but i feel like that would make more sense to expose as a completely separate resource from the standard filesystem tree.

hmmmmmm actually, i'm not sure about that! if i want transactional semantics across file paths, i also want transactional semantics within a single file path. the pattern of explicit resource request => blocking commit syscall to establish ordered transaction boundaries should be able to apply to changes within a single file too.

and if my goal is to synchronize changes to disk as a transaction (as opposed to just IPC propagation), then i should have to specify that when i request a sync context to expose to a process (e.g. within a subprocess spawn call). cc @miss_rodent

@hipsterelectron@circumstances.run · 2 hours ago

@SRAZKVT @miss_rodent i also think a subprocess spawn should be a two-phase operation. first identify the sync context(s?), the imported paths from each, the executable path among those imports, the command line, the environment, and any initial export paths per sync context, and then the sync domain for each (in-memory, disk-persisted, both, maybe remote mirroring too?).

that initial "spawn" call would return an opaque handle. then you can do some other things in the meantime if you want. then it's another blocking syscall that waits for the task to be assigned a PID (no guarantees about scheduling).

maybe you could actually have a third phase too where you request some amount of cpu time. and then you could have a blocking syscall that waits for the PID to exhaust its assigned cpu time. that would actually be........a much more natural way to schedule than trying to infer cpu scheduling from i/o dependencies??????

@hipsterelectron@circumstances.run · 2 hours ago

@SRAZKVT @miss_rodent obv these sequential phases should be possible to both request and then wait for within a single blocking call. but since the returned "handle" would have a different type for each, i would probably just have four distinct spawn methods. maybe this is a job for mutable out-pointers though

@hipsterelectron@circumstances.run · 2 hours ago

@SRAZKVT @miss_rodent oooooooooo omg so scheduling cpu time slices according to the dependencies inferred from waitpid() calls is sooooooo much easier and simpler. that actually aligns the application task dependency graph directly with a build tool task dependency graph. and spawn()/waitpid() calls are like coroutine yield points, except they don't need to block the current process. so this is actually more powerful than the pants task graph bc it's not limited to python coroutine semantics

@hipsterelectron@circumstances.run · 2 hours ago

@SRAZKVT @miss_rodent requesting a fixed-size CPU time slice wouldn't be the only possibility. "run until complete" sounds reasonable too. and maybe you could have a blocking waitpid() call that works like a compare-and-swap loop, calling a provided process-local function pointer every time the previous slice completes, to either request a new slice or exit with a timeout error. and it returns with either the process exit status or the timeout error from the invoked function pointer. and the function pointer could do arbitrary stuff including blocking on some other call
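a userspace approximation of that waitpid() loop, assuming python's `subprocess` module. the big caveats: the "slice" here is wall-clock time rather than cpu time, and the child keeps running between slices instead of being descheduled. `wait_in_slices` and `on_slice_end` are hypothetical names:

```python
import subprocess
import sys


def wait_in_slices(proc, slice_seconds, on_slice_end):
    """Block until proc exits, consulting on_slice_end after every slice.

    on_slice_end(proc) returns True to request another slice, or False to
    give up, in which case the child is killed and TimeoutError is raised.
    """
    while True:
        try:
            return proc.wait(timeout=slice_seconds)  # exit status on completion
        except subprocess.TimeoutExpired:
            if not on_slice_end(proc):
                proc.kill()
                proc.wait()  # reap the child so it does not linger as a zombie
                raise TimeoutError("callback declined another slice")


if __name__ == "__main__":
    child = subprocess.Popen([sys.executable, "-c", "print('done')"])
    print(wait_in_slices(child, 0.5, lambda p: True))
```

the callback can do arbitrary work, including blocking on something else, before deciding whether to grant the next slice.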

@hipsterelectron@circumstances.run · 2 hours ago

@somebody [reprimanding my user application processes] blocked blocked blocked blocked blocked. NONE of you are free of sin

@hipsterelectron@circumstances.run · 1 hour ago

omg yet another way simon peyton-jones was wrong. cyclic task dependencies are problematic if you're a build tool and you expect to be able to infer the entire toposorted sequence of task invocations offline before executing them.

but if you are an operating system, there is no concept of global task completion (perhaps when PID 1 exits? do we still need a PID 1?), and you are constantly (in an online manner) selecting which live+active [with a nonzero remaining time slice] tasks to schedule on the available cores.
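the contrast can be shown with two tasks that each wait on the other. an offline toposort refuses the graph, while an online round-robin loop just keeps handing out slices until both finish. this is a toy sketch with generators standing in for processes; `ping`, `pong` and `run_online` are made-up names:

```python
from collections import deque
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


def offline_order(deps):
    """Plan the whole run up front; raises CycleError on cyclic dependencies."""
    return list(TopologicalSorter(deps).static_order())


def ping(inbox, outbox, rounds, log):
    for i in range(rounds):
        outbox.append(i)
        while not inbox:
            yield  # give up the slice until the peer replies
        log.append(inbox.popleft())


def pong(inbox, outbox, rounds):
    for _ in range(rounds):
        while not inbox:
            yield  # nothing to echo yet
        outbox.append(inbox.popleft())


def run_online(tasks):
    """Round-robin scheduler: one slice per live task, no global plan."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, drop it
        ready.append(task)


if __name__ == "__main__":
    try:
        offline_order({"ping": {"pong"}, "pong": {"ping"}})
    except CycleError:
        print("offline planner: cycle, cannot order tasks")
    a_to_b, b_to_a, log = deque(), deque(), []
    run_online([ping(b_to_a, a_to_b, 3, log), pong(a_to_b, b_to_a, 3)])
    print(log)
```

the scheduler never needs to know the dependency graph; each `yield` is the point where a task reports that it can't make progress right now.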

AND THE TASK SCHEDULER COULD BE A USERSPACE PROCESS TOO???????

@hipsterelectron@circumstances.run · 1 hour ago

i think a task scheduler process/service should be allocated from ring 0 and should have uncontested ownership over a set of hardware cores (i.e. no overlaps with other schedulers). and the scheduler should also have uncontested ownership over a set of live (not necessarily active) application processes.

and maybe application tasks can be explicitly+atomically migrated across schedulers? and maybe cores can be migrated across schedulers too.

and maybe there's an analogous kind of scheduler process for OS tasks (drivers, i/o sync)? so it can migrate cores with application schedulers, but not tasks (bc OS services are expected to have idiosyncratic scheduling requirements).

@hipsterelectron@circumstances.run · 1 hour ago

hmmmmmm......................uncontested ownership of cores sounds silly, because we will generally want to maintain some kind of task-core local affinity (so virtual memory + i/o buffers from those tasks remain warm in per-core L1 cache).

this kind of structured locality is why we invented named i/o sync contexts for i/o state (and the transactional request=>commit model). so two questions arise here:

should i/o sync contexts be linked to scheduler sync contexts?
should scheduler processes be linked to cores, or tasks?

sarah tonin :wlfBlep:

@SRAZKVT@tech.lgbt · 2 hours ago

@hipsterelectron hipsteros time

@hipsterelectron@circumstances.run · 2 hours ago

@SRAZKVT i was looking at seL4 to steal their kernel bootstrap but then i saw they use google repo to manage their repos and i thought hmmmm no

@hipsterelectron@circumstances.run · 2 hours ago

@SRAZKVT i know QEMU is the standard way to fuck around

@hipsterelectron@circumstances.run · 2 hours ago

@SRAZKVT i was thinking at first that the i/o sync logic (i.e. managing all my little fractal zip journals) would be in ring 0 but the thing about having a contiguous-memory serializable representation of i/o state is that it can be moved across processes and that's obviously the right thing to do

@hipsterelectron@circumstances.run · 2 hours ago

@SRAZKVT literally it's so great to do atomic blocking transactional semantics for i/o propagation bc instead of making my brain hurt thinking about atomics i can just write synchronous logic against the C abstract machine