@SRAZKVT you can use my fractal zip format for this purpose actually (one of the design goals was representing a dataset with changes over time). that's actually a really fascinating corollary of using it to represent atomic transitions between reproducible filesystem states... i did not realize that VCS is precisely a formulation of atomic filesystem transactions. need to think about this further
if we accept that a cryptographic checksum is "proof" of reproducibility, then a signature scheme is strictly more powerful--proof of reproducible output, and proof that the output checksum was not modified after being generated by the holder of the private key!
let's take a look at that patch series now.
lwn complains about AI scrapers when you try to access their locally hosted copy of diffs from LKML. https://web.archive.org/web/20250409044448/https://lwn.net/ml/all/20250120-module-hashes-v2-2-ba1184e27b7f@weissschuh.net/
(why not just link to LKML if they're getting scraped so hard? if you ask questions like this you will not like the answers you find)
this diff [2/6 in the patchset] adds a new config option that disables the existing config option to enforce signature checking. real Kconfig heads will understand that this config dependency is equivalent to an override mechanism. so you can disable module signature enforcement even if the user config requires it.
that's not even the cryptographic part yet, just an extra build system backdoor. the cryptographic claims are next.
this is the magnum opus of the reproducible builds evangelism strike force: https://web.archive.org/web/20250408191140/https://lwn.net/ml/all/20250120-module-hashes-v2-6-ba1184e27b7f@weissschuh.net/. let's evaluate these claims:
The current signature-based module integrity checking has some drawbacks in combination with reproducible builds:
drawbacks in combination? that makes it sound like reproducible builds are a simple config setting. that would be nice, right? if reproducible builds had precise semantics? and they fucked off and stopped bothering everyone else?
Either the module signing key is generated at build time, which makes the build unreproducible,
we're going to examine this claim in more detail presently. but first we absolutely need to highlight the rest of this sentence:
or a static key is used, which precludes rebuilds by third parties and makes the whole build and packaging process much more complicated.
i cannot possibly express the violent feelings within me upon reading this statement:
- a "static key" refers to "literally a normal key, the way it worked before".
- does it "preclude rebuilds by third parties"? (we will evaluate this below.)
- "makes the whole build and packaging process much more complicated" -- again, this is literally the way it works right now.
so the reproducible builds evangelism strike force get to whine about how complicated it is to make the build reproducible. if you hate your job then maybe choose a different line of work?
but [6/6] in this patchset has so much more to show us! here is the reproducible build squadron's best and brightest, making things less complicated:
diff --git a/Documentation/kbuild/reproducible-builds.rst b/Documentation/kbuild/reproducible-builds.rst
index f2dcc39044e66ddd165646e0b51ccb0209aca7dd..6a742ad745113a9267223b33810dbc7218c47d4c 100644
--- a/Documentation/kbuild/reproducible-builds.rst
+++ b/Documentation/kbuild/reproducible-builds.rst
@@ -79,7 +79,10 @@ generate a different temporary key for each build, resulting in the
modules being unreproducible. However, including a signing key with
your source would presumably defeat the purpose of signing modules.
-One approach to this is to divide up the build process so that the
+Instead ``CONFIG_MODULE_HASHES`` can be used to embed a static list
+of valid modules to load.
+
+Another approach to this is to divide up the build process so that the
unreproducible parts can be treated as sources:
1. Generate a persistent signing key. Add the certificate for the key
so, instead of forcing our brave and noble reproducible builds advocates to suffer the cruel and unusual punishment of "splitting up the build process", we now have the much less complex alternative of "adding a backdoor in Kconfig that short-circuits signature checking at runtime"
recall lwn's comment on this dastardly mathematical trickery:
It's tempting to search for a clever cryptographic solution, but nobody has yet proposed one.
i had earlier today developed a lengthy protocol that would have solved a much harder problem, but i've just realized i was still giving them far too much credit.
this isn't a "clever cryptographic solution". they're using deterministic fucking signatures. reproducing the build just means providing a private key to the build process, the method that is already supported for module signing.
it's right there in Documentation/admin-guide/module-signing.rst:
(4) :menuselection:`File name or PKCS#11 URI of module signing key`
(``CONFIG_MODULE_SIG_KEY``)
Setting this option to something other than its default of
``certs/signing_key.pem`` will disable the autogeneration of signing keys
and allow the kernel modules to be signed with a key of your choosing.
you can even sign off by hand:
To manually sign a module, use the scripts/sign-file tool available in
the Linux kernel source tree. The script requires 4 arguments:
1. The hash algorithm (e.g., sha256)
2. The private key filename or PKCS#11 URI
3. The public key filename
4. The kernel module to be signed
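to make the determinism point concrete, here is a toy sketch in python. it is not what scripts/sign-file does (the kernel uses real PKCS#7 signatures); it is unpadded textbook RSA over two small Mersenne primes, which is hopelessly insecure and only here for illustration. every name in it (P, Q, digest_int, sign, verify) is made up for this example. the assumption being illustrated is that the signature scheme is deterministic, the way RSA with PKCS#1 v1.5 padding is; a scheme that mixes in fresh randomness per signature would not reproduce byte for byte.

```python
import hashlib

# Toy textbook-RSA parameters built from two Mersenne primes.
# Hypothetical values: far too small, and unpadded, to be secure.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest_int(module: bytes) -> int:
    """Checksum step: SHA-256 of the module bytes, reduced modulo N."""
    return int.from_bytes(hashlib.sha256(module).digest(), "big") % N


def sign(module: bytes) -> int:
    """Signature step: the checksum raised to the private exponent."""
    return pow(digest_int(module), D, N)


def verify(module: bytes, signature: int) -> bool:
    """Anyone holding only the public pair (N, E) can run this check."""
    return pow(signature, E, N) == digest_int(module)


module = b"pretend these are the bytes of foo.ko"
sig_first_build = sign(module)
sig_second_build = sign(module)

# Same key and same module bytes give the same signature on every build.
print(sig_first_build == sig_second_build)
# The signature verifies against the module it was made for...
print(verify(module, sig_first_build))
# ...and is checked again here against modified bytes.
print(verify(module + b" tampered", sig_first_build))
```

the signing step is literally "checksum, then one more operation with the private key", which is the sense in which a signature is a checksum plus something extra.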
so reproducible builds advocates claim to understand cryptography enough to interpret a cryptographic checksum, but they don't seem to understand that signatures are strictly more powerful proofs than checksums, even though the module signing guide makes it quite clear that signatures are just checksums++.
so, what do they mean by "rebuilds", and what does the CONFIG_MODULE_HASHES backdoor let them achieve?
well, first let's recall the final word of warning from module-signing.rst:
Since the private key is used to sign modules, viruses and malware could use the private key to sign modules and compromise the operating system.
well, i wouldn't go so far as to call arch linux distro packagers "viruses and malware". that's a little harsh.
The private key must be either destroyed or moved to a secure location and not kept in the root node of the kernel source tree.
CONFIG_MODULE_HASHES, of course, does exactly what this warning exists to prevent: it enables viruses and malware to sign modules and compromise the operating system.
but surely i'm mistaken. let's check lwn again:
If the signing keys are publicly available for use in recreating the build, malicious actors could also sign modified loadable modules with them.
that is indeed exactly what module-signing.rst warned us about!
If they aren't publicly available, the build can't be reproduced.
you see what this means right? "reproduced" doesn't mean "reproduced". "reproduced" means "malicious actors can also sign modified loadable modules".
why do i refer to CONFIG_MODULE_HASHES as a backdoor? because it overrides signature checking by coming before it:
diff --git a/kernel/module/main.c b/kernel/module/main.c
index effe1db02973d4f60ff6cbc0d3b5241a3576fa3e..094ace81d795711b56d12a2abc75ea35449c8300 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -3218,6 +3218,12 @@ static int module_integrity_check(struct load_info *info, int flags)
 {
 	int err = 0;
+	if (IS_ENABLED(CONFIG_MODULE_HASHES)) {
+		err = module_hash_check(info, flags);
+		if (!err)
+			return 0;
+	}
+
 	if (IS_ENABLED(CONFIG_MODULE_SIG))
 		err = module_sig_check(info, flags);
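the control flow in that hunk can be modeled in a few lines of python. this is a sketch of the quoted lines only: the hunk is cut off before the function returns, so the final `return err` is an assumption, and the check functions, the ALLOWED_HASHES set, and the -1 error value are hypothetical stand-ins rather than anything from the kernel.

```python
def module_integrity_check(module, *, hashes_enabled, sig_enabled,
                           hash_check, sig_check):
    """Mirror the quoted hunk: 0 means 'ok to load', nonzero is an error."""
    err = 0
    if hashes_enabled:
        err = hash_check(module)
        if not err:
            return 0  # early return: the signature check below never runs
    if sig_enabled:
        err = sig_check(module)
    return err  # assumed: the quoted hunk ends before the real return


# Hypothetical stand-ins for the two kernel checks.
ALLOWED_HASHES = {"foo.ko"}
sig_check_calls = []


def fake_hash_check(module):
    return 0 if module in ALLOWED_HASHES else -1


def fake_sig_check(module):
    sig_check_calls.append(module)
    return -1  # pretend every module in this demo has a bad signature


# A module on the hash list loads without its signature being looked at.
listed = module_integrity_check("foo.ko", hashes_enabled=True, sig_enabled=True,
                                hash_check=fake_hash_check, sig_check=fake_sig_check)
# A module that is not on the list falls through to the signature check.
unlisted = module_integrity_check("bar.ko", hashes_enabled=True, sig_enabled=True,
                                  hash_check=fake_hash_check, sig_check=fake_sig_check)

print(listed, unlisted, sig_check_calls)
```

in this model the signature check only ever sees modules that the hash check rejected, which is what "coming before it" means here.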
this is all trying to dance around two very subtle points that require a very specialized technical understanding of cryptography to infer:
- the signing keys are secret because downstream distro packagers and/or corporate sysadmins are the malicious actors which module signatures protect against!
- more importantly, cryptographic signatures are just unspoofable checksums!
the fact that they're not "reproducible" is because they use secret data (the private key) to stop "malicious actors" from generating new checksums for "modified loadable modules"!
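a minimal python sketch of the "secret data" point, using only the standard library. loud caveat: HMAC is a symmetric keyed tag, not a public-key signature (the verifier needs the same key), so it only stands in for the one property being discussed: nobody without the secret can mint a valid tag for new bytes. SECRET_KEY, checksum, and keyed_tag are hypothetical names for this example.

```python
import hashlib
import hmac

# Hypothetical stand-in for the private key; held only by the signer.
SECRET_KEY = b"held only by the signer"


def checksum(module: bytes) -> str:
    """Plain checksum: anyone can compute it for any bytes."""
    return hashlib.sha256(module).hexdigest()


def keyed_tag(module: bytes, key: bytes) -> str:
    """Keyed checksum: the output depends on the secret key as well."""
    return hmac.new(key, module, hashlib.sha256).hexdigest()


original = b"original module bytes"
modified = b"modified module bytes"

# Regenerating a plain checksum for modified bytes needs no secret at all.
regenerated_checksum = checksum(modified)

# Regenerating the keyed tag without the secret gives a tag that does not
# match what the real key holder would have produced for the same bytes.
attacker_tag = keyed_tag(modified, b"attacker's guess at the key")
genuine_tag = keyed_tag(modified, SECRET_KEY)

print(regenerated_checksum == checksum(modified))
print(hmac.compare_digest(attacker_tag, genuine_tag))
```

the plain checksum is "reproducible" by everyone, including whoever modified the module; the keyed one is reproducible only by the key holder.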
claiming cryptographic signatures are "nonreproducible" is a non sequitur--they are literally just a list of module checksums. it's the exact same fucking thing, except there is an additional cryptographic proof that modules haven't been modified since they left the custody of the key owner.
lwn, kpcyrd, Thomas Weißschuh, and everyone associated with the module hashing for "reproducibility" patchset are either completely unaware of how cryptography works (and therefore should not be trusted with crypto), or they are lying in order to backdoor linux users (and therefore should not be trusted with crypto)