The UNIX system has been in wide use for over 20 years, and has helped to define many areas of computing.
i am quite surprised that they remain stuck in the mode of trying to adjudicate a best-effort answer to mutually conflicting modifications. to me, "conflicts" (e.g. writing to the same location) indicate that there is an unresolved user-level "conflict" between their scheduled tasks!
and this is where aaron turon's formal verification framework would say: the behavior is incorrect! it produces a data race!
that's why i proposed the named hierarchy of sync contexts (up to global). it takes our environment isolation mechanism and gives the user a structured resource (with a lifetime) they can use to express a sequence of modifications that can be unambiguously verified for correctness.
what is correctness?
in each sync context, modifications are built up as an ordered sequence of i/o operations, then explicitly committed. as these operations are evaluated during the blocking commit call, the only correctness requirement is that a named resource (file path) previously committed to the same context cannot match the name of a new resource.
note that this arises not as we engage in attempting to write a whole bunch of data, but when we allocate (e.g. open()) a resource handle. this should fail eagerly, and quickly, and unambiguously indicate a problem with user input, or with the input environment!
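a minimal sketch of that rule in python, to make the "fail at open(), not at write()" part concrete. everything here is made up for illustration: `SyncContext`, `Handle`, and `NameConflict` are hypothetical names, the `store` dict is a stand-in for the disk, and i'm assuming a name counts as claimed from the moment it is opened (not only once it is committed). the hierarchy of contexts is left out.

```python
class NameConflict(Exception):
    """hypothetical error type: the name is already claimed in this context."""


class SyncContext:
    """toy sync context: an ordered log of pending writes plus a set of claimed names."""

    def __init__(self, name):
        self.name = name
        self.claimed = set()  # every path opened or committed in this context
        self.pending = []     # ordered (path, data) operations awaiting commit
        self.store = {}       # stand-in for the disk: path -> bytes

    def open(self, path):
        # the check happens at allocation time, before any data is written
        if path in self.claimed:
            raise NameConflict(f"{path!r} already claimed in context {self.name!r}")
        self.claimed.add(path)
        return Handle(self, path)

    def commit(self):
        # blocking commit: evaluate the operations in the order they were queued
        for path, data in self.pending:
            self.store[path] = self.store.get(path, b"") + data
        count = len(self.pending)
        self.pending.clear()
        return count


class Handle:
    """write handle that only appends to the context's log."""

    def __init__(self, ctx, path):
        self.ctx = ctx
        self.path = path

    def write(self, data):
        # nothing touches the store yet; the write is only recorded
        self.ctx.pending.append((self.path, data))
```

opening `"out/a.o"` twice in the same context raises `NameConflict` on the second `open()`, before a single byte is queued, which is the eager, unambiguous failure described above.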
OMG NOOOOOOO I JUST READ THE NEXT PAGE THIS IS A MIND MELD
The second case is the one that gives rise to transactions. Here it is recognized that changes to sets of objects are related.
LOCALITY GANG!!!!!
Reconciliation of differing versions of an object must be coordinated with other objects and the operations on those objects which occurred during partition.
name collision indicates a model failure—it can't be expected to be resolved by just choosing one version!
In addition, LOCUS provides a full nested transaction facility for those cases where the user wishes to bind a set of events together.
"full nested transaction facility" is exactly what people writing build processes have needed for DECADES
Case specific merge strategies have been developed.
🤩🤩🤩🤩 that's me rn
emailing posix subject "full nested transaction facility" link to this pdf send
(not really; i aborted the transaction. see how broadly this can be applied?)
ok so i was vaguely thinking earlier that like "this network partitioning stuff is more intense than my purely-local use case" but another cited paper on mutual inconsistency detection had this really clever line: https://www.cs.purdue.edu/homes/bb/cs542-11Spr/Parker_TSE83.pdf
The most typical response is to enforce consistency by permitting files to be accessed only in one partition.
this is precisely what i was imposing when describing name conflicts as a model error!
Unfortunately, effective implementation of this policy can often result in the files being accessible in zero partitions!
in our case, this could mean...not persisting to disk before a crash! in the posix email, i described a hierarchy, where syncing to a named context must occur before persisting. that was wrong!
oh no.......oh dear........what if........what if this means i do need to persist to disk like the other filesystems i derided so eagerly
i think if i end up with a really thoughtful set of heterogeneous interacting processes managing custom-built data structures both in-memory and on-disk like zfs does..........maybe i can accept that. jvm bytecode is easily the best fucking IR humanity has ever achieved
jar files are zip files because sun microsystems understands that the most powerful journaled filesystem......is the one you carry with you in your heart every day
i hope google succeeds in getting python to switch off zips to .tar.zsts so i can roll out my Zip File From the Future with tree hashing for fast splitting and merging along with the merkle-damgård length extension proof of concept
but i think they won't, because the zip index is too useful. zip files are literally just tarballs with an index. undefeatable
the future part is that my zip can be made to allow either leading or trailing bytes, so it can avoid clobbering its own index when doing appends, but still supports self extracting executables and still foils length extension attacks
it's really simple actually it's not really like an achievement or something but it's really really sick to have a data layout that's immediately readable, immediately appendable without blocking, and has unambiguous boundaries so splitting into subsets, or merging entries from multiple constituents, are all things you can do fearlessly and frequently
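for reference, here is roughly what the stock zip format already gives you, sketched with python's standard `zipfile` module. this is not the Zip File From the Future: plain append mode rewrites the trailing index in place, which is exactly the clobbering described above. `append_entries` and `merge` are hypothetical helper names, and the collision check in `merge` is my assumption about how a name conflict should be treated (as an error, not something to resolve by picking a version).

```python
import io
import zipfile


def append_entries(buf, entries):
    """append (name, data) pairs to buf; any leading non-zip bytes are kept as-is."""
    # mode "a" appends a fresh archive if buf is not a zip yet, otherwise it
    # adds entries and rewrites the central directory at the end
    with zipfile.ZipFile(buf, mode="a") as zf:
        for name, data in entries:
            zf.writestr(name, data)


def merge(sources, dest):
    """copy every entry from each source archive into dest, refusing name collisions."""
    with zipfile.ZipFile(dest, mode="a") as out:
        seen = set(out.namelist())
        for src in sources:
            with zipfile.ZipFile(src) as zf:
                for name in zf.namelist():
                    if name in seen:
                        # a collision is treated as a model error, not resolved
                        raise ValueError(f"name collision: {name}")
                    seen.add(name)
                    out.writestr(name, zf.read(name))
```

starting from `io.BytesIO(b"#!/bin/sh\nexit 0\n")` and calling `append_entries` twice leaves the leading bytes untouched and both entries readable, which is the self-extracting-executable trick in miniature.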
a lot of the classic kernel data structures are specifically intended to do two things i really don't care about:
(1) maintain atomic read/write coherency from separate processes over a contiguous region of shared memory
(2) infer likely future i/o access patterns
(1) i simply refuse to accept as a valid behavior (referring to write()/read() atomicity and serialization for overlapping regions of the same physical page). i think that's literally just obvious UB if it was in the same address space?
it's a textbook math problem where the right answer is "not enough information"