The UNIX system has been in wide use for over 20 years, and has helped to define many areas of computing.
Another strategy for pathname searching is to ship partial pathnames to foreign sites so they can do the expansion locally, avoiding remote directory opens and network transmission of directory pages.
yes!!!!! yes!!!!! locality!!!!!!
Such a solution is being investigated but is more complex in the general case
QUITTER
The important concept of atomically committing changes has been imported from the database world and integrated into LOCUS.
i'm gonna call myself a database guy now
omg SHIT it has both COMMIT and ABORT!
cc @zwol finally i have found some precedent for filesystem transactions (popek 1981) https://www.cs.princeton.edu/courses/archive/fall03/cs518/papers/locus.pdf this is a distributed filesystem with many similarities to current ones. as you can imagine it's purely regarding written pages but i think an ordered sequence of local i/o operations eventually merged into a wider (possibly global) synchronization context is an extremely reasonable thing to consider
A queue of propagation requests is kept by the kernel at each site and a kernel process services the queue.
literally the pants build tool
Propagation is done by "pulling" the data rather than "pushing" it.
so this is generally how you would analyze a dependency graph but also this is like state of the art 1981 network shit and in that context this also demonstrates an early mitigation to thundering herd issues
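a minimal sketch of the pull model from the two quotes above, not the LOCUS kernel code. `Site`, `commit`, `service_queue` and the version counter are all made-up names for illustration. the only thing taken from the paper is the shape: the committing site enqueues a small request at each replica, and the replica's own service loop fetches the pages when it gets around to it.

```python
from collections import deque


class Site:
    """Hypothetical stand-in for a site that stores file pages."""

    def __init__(self, name):
        self.name = name
        self.pages = {}        # (file_id, page_no) -> bytes
        self.versions = {}     # file_id -> last version seen here
        self.queue = deque()   # pending propagation requests

    def commit(self, file_id, pages, replicas):
        """Commit pages locally, then notify replicas without sending data."""
        version = self.versions.get(file_id, 0) + 1
        for page_no, data in pages.items():
            self.pages[(file_id, page_no)] = data
        self.versions[file_id] = version
        for replica in replicas:
            # The request says what changed and where to get it, nothing more.
            replica.queue.append((file_id, version, sorted(pages), self))

    def service_queue(self):
        """Stand-in for the kernel process that services the queue."""
        while self.queue:
            file_id, version, page_nos, source = self.queue.popleft()
            if self.versions.get(file_id, 0) >= version:
                continue  # already at this version or newer
            for page_no in page_nos:
                # The replica pulls each page from the source at its own pace.
                self.pages[(file_id, page_no)] = source.pages[(file_id, page_no)]
            self.versions[file_id] = version
```

because each replica drains its own queue, a burst of commits turns into a backlog at the replicas instead of a burst of pushes from the writer.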
BRUH
Given this commit mechanism, one is always left with either the original file or a completely changed file but never with a partially made change, even in the face of local or foreign site failures. Such was not the case in the standard Unix environment.
UNIX-HATERS HANDBOOK LOST CHAPTER
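for comparison, here is a sketch of the closest thing a userspace program can do today to get the "original file or completely changed file" guarantee for a single local file. this is the usual write-to-temp-then-rename idiom, not the LOCUS commit mechanism, and `atomic_write` is a hypothetical helper name. it relies on `os.replace` swapping the name in one step on the same filesystem.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the contents of `path` with `data` (bytes), all or nothing."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # get the new bytes onto disk first
        os.replace(tmp_path, path)   # COMMIT: readers now see the new file
    except BaseException:
        # ABORT: discard the partial copy, the original is untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

note this covers one file on one machine. it says nothing about multi-file sequences or replicas, which is the part the paper is actually about.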
However, due to the potential for replicated storage of a new file, the create call needs two additional pieces of information - how many copies to store and where to store them.
you're never gonna believe what i emailed posix about earlier !!
except in my model replication is inferred by the system from the highly structured i/o agenda, which is provided at "compile time" before executing the task graph. so the extra context for me is instead a specific named "synchronization context" which generally corresponds to the virtual address space and visible filesystem contents for a set of tasks
(the sync context is itself a graph structure.)
Adding such information to the create call would change the system interface so instead defaults and per process state information is used, with system calls to modify them.
i'm not afraid of EEE
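a sketch of what "defaults and per process state" could look like, assuming names the paper does not give: `ProcessState`, `set_replication` and this `create` signature are all hypothetical. the site-selection policy here (local site first, then the process's preferred sites, then anything reachable) is also an assumption made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessState:
    """Hypothetical per-process replication settings, with defaults."""
    ncopies: int = 1
    preferred_sites: list = field(default_factory=list)


def set_replication(proc, ncopies, preferred_sites=()):
    """Stand-in for the extra system call that modifies the per-process state."""
    if ncopies < 1:
        raise ValueError("a file needs at least one copy")
    proc.ncopies = ncopies
    proc.preferred_sites = list(preferred_sites)


def create(proc, path, local_site, reachable_sites):
    """create() takes no replication arguments; it reads the process state."""
    ordered = [local_site, *proc.preferred_sites, *reachable_sites]
    chosen = []
    for site in ordered:
        if site in reachable_sites and site not in chosen:
            chosen.append(site)
        if len(chosen) == proc.ncopies:
            break
    return {"path": path, "storage_sites": chosen}
```

the design point is that existing programs calling `create` keep working unchanged, and only programs that care about replication make the extra call.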
you could argue that sync context could be inferred from the i/o agenda too but:
- this is actually the only way my system supports state sync i.e. shared memory
- anonymous sync contexts are used to establish "chroots" for atomic i/o sequences
- if the user has their own ideas about scheduling, listen to them!!
the day win wang and i changed the world together at twitter inc was when we realized the infrastructure we'd constructed couldn't do context-specific locality. win's rsc scala compiler was DAG-scheduled and needed to share memory in a persistent local jvm. once it generated an outline to compile against, we could farm out embarrassingly parallel AOT-compiled scalac jobs (using the scoot RPC system from drew gassaway).
pants couldn't do that—yet. but we did it together. immediate 2x improvement before any further optimization. that's why compiler and build tool devs need to work together!!!
good thing too, cause it was like 3 days before this talk https://youtube.com/watch?v=87K4_v2IvBg it's not me giving it but there's a point where he shows the zipkin traces we constructed and you can just see the parallelism explode like a dubstep drop
This algorithm is localized in the code and may change as experience with replicated files grows.
also of course this project treats inodes like a real thing that has meaning
The storage site allocates an inode number from a pool which is local to that physical container of the filegroup. That is, to facilitate inode allocation and allow operation when not all sites are accessible, the entire inode space of a filegroup is partitioned so that each physical container for the filegroup has a collection of inode numbers that it can allocate.
ted ts'o screaming crying throwing up rn
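a toy version of the partitioned inode space, to show why no coordination is needed at create time. `Container` and `partition_inode_space` are hypothetical names, and the equal contiguous ranges are an assumption: the quote only says each physical container gets its own collection of inode numbers.

```python
from collections import deque


class Container:
    """Hypothetical physical container with its own pool of inode numbers."""

    def __init__(self, name, inode_numbers):
        self.name = name
        self.free = deque(inode_numbers)

    def allocate(self):
        """Hand out an inode number without talking to any other site."""
        if not self.free:
            raise RuntimeError(f"container {self.name} has no free inodes")
        return self.free.popleft()


def partition_inode_space(total, names):
    """Split inode numbers 0..total-1 into equal disjoint ranges, one per container."""
    share = total // len(names)
    return {
        name: Container(name, range(i * share, (i + 1) * share))
        for i, name in enumerate(names)
    }
```

since the pools are disjoint, two containers that cannot reach each other can both keep creating files and never hand out the same number.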
When all the storage sites have seen the delete, the inode can be reallocated by the site which has control of that inode (i.e. the storage site of the original create).
so this is legitimately the reason you want inodes like internally right? to do localized resource indexing! and like ok, but:
- if that's the purpose of the inode, then don't fucking expose it to userspace? if i use an inode that way, it's not gonna be in my dirents!
- the user still deserves an external inode! generate it completely differently! recycle it in its own way!
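and a sketch of the delete rule quoted above: the inode only goes back into the pool once every storage site has seen the delete, and only the controlling site does the reallocating. `InodeOwner`, `delete` and `ack_delete` are hypothetical names, and tracking acknowledgements in a set is an assumption about bookkeeping the quote does not describe.

```python
class InodeOwner:
    """Hypothetical controlling site for a set of inode numbers."""

    def __init__(self, storage_sites):
        self.storage_sites = set(storage_sites)
        self.pending = {}   # inode -> sites that have not yet seen the delete
        self.free = []      # inodes that are safe to reallocate

    def delete(self, inode):
        """Start a delete: every storage site still has to see it."""
        self.pending[inode] = set(self.storage_sites)

    def ack_delete(self, inode, site):
        """Record that `site` has seen the delete; free the inode on the last ack."""
        waiting = self.pending[inode]
        waiting.discard(site)
        if not waiting:
            del self.pending[inode]
            self.free.append(inode)
```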
Solutions to the number representation and byte ordering problems have not yet been implemented.
LMAO