The adler32 checksum implementation introduces a (seemingly useless) multiplication by 1 so that it can use the VPMADDWD widening multiply-and-add instruction.
After a recent change to stdarch, the multiplication got optimized out before that instruction could be selected, leading to much worse performance.
So I added some logic to LLVM that re-introduces the multiplication by 1 when it is the only thing preventing the better instruction from being selected.
https://github.com/llvm/llvm-project/pull/174149/changes
(The stdarch change has been reverted for now, but with LLVM 22 we can re-land it without regressing performance.)
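
To make the multiply-by-1 trick concrete, here is a minimal Rust sketch. It is not the actual adler32 or stdarch code: the function names (`madd_i16`, `pairwise_widening_sum`, `pairwise_widening_sum_simd`) are made up for illustration, and the 128-bit form of the instruction is used only to keep the example small. The scalar function models what one VPMADDWD does, and the x86_64 variant calls the real `_mm_madd_epi16` intrinsic with a vector of ones.

```rust
/// Scalar model of VPMADDWD on 8 lanes: multiply adjacent signed 16-bit
/// lanes, widen the products to 32 bits, and add each adjacent pair.
fn madd_i16(a: [i16; 8], b: [i16; 8]) -> [i32; 4] {
    let mut out = [0i32; 4];
    for i in 0..4 {
        let lo = (a[2 * i] as i32) * (b[2 * i] as i32);
        let hi = (a[2 * i + 1] as i32) * (b[2 * i + 1] as i32);
        // The hardware instruction wraps on the single overflowing case.
        out[i] = lo.wrapping_add(hi);
    }
    out
}

/// The "seemingly useless" multiplication: with all-ones as the second
/// operand, the multiply-add becomes a pairwise widening sum.
fn pairwise_widening_sum(a: [i16; 8]) -> [i32; 4] {
    madd_i16(a, [1; 8])
}

/// Same operation through the real intrinsic (hypothetical wrapper name).
#[cfg(target_arch = "x86_64")]
fn pairwise_widening_sum_simd(a: [i16; 8]) -> [i32; 4] {
    use std::arch::x86_64::{
        __m128i, _mm_loadu_si128, _mm_madd_epi16, _mm_set1_epi16, _mm_storeu_si128,
    };
    let mut out = [0i32; 4];
    // SSE2 is part of the x86_64 baseline, so no runtime feature check is needed.
    unsafe {
        let v = _mm_loadu_si128(a.as_ptr() as *const __m128i);
        let ones = _mm_set1_epi16(1);
        let sums = _mm_madd_epi16(v, ones); // multiply by 1, then add pairs
        _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, sums);
    }
    out
}

/// Fallback for other architectures: reuse the scalar model.
#[cfg(not(target_arch = "x86_64"))]
fn pairwise_widening_sum_simd(a: [i16; 8]) -> [i32; 4] {
    pairwise_widening_sum(a)
}
```

For an input whose first two lanes are both `i16::MAX`, the first output lane is 65534, which does not fit in 16 bits. That widening is the part the multiplication by 1 buys: once an optimizer folds the multiplication away, what remains no longer looks like a multiply-add, which is the situation described above.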