are filesystems on linux just safer than the network subsystems, or do filesystems just never expose any interface besides posix i/o, so they have a much smaller and better-characterized attack surface?
hm so a "thread" is mostly memory context? but then it's less of an "OS thread" than a "cpu thread" imho
the uvm phd dissertation begins with a tirade on copying and i feel i may have embarrassed myself saying zero-copy is ridiculous. but also this is 1998 and multicore wasn't a thing, so each copy was a synchronous computation
"Finally, data copying often flushes useful information out of the cache. Since the CPU accesses main memory through the cache, the process of copying data fills the cache with the data being copied — displacing potentially useful data that was resident in the cache before the copy."
oh i'll be upset if the CPU manages the cache without me. i bet it does
so when people say memory bandwidth is often a limiting factor that's because the OS doesn't actually manage the cache? i guess this makes sense, especially with atomic coherency (i know that's in the cpu)
i was gonna say "what if they let me manage the cpu cache. just a little. as a treat" but x86 absolutely does have prefetch hint instructions! i remember this for reasons
see i will absolutely get all pouty and demand to control the kernel's i/o mechanics but the processor is cute it's trying its best i'm fine with a hint
so far it still seems that i am perfectly allowed to avoid linux's horrific decision to make entries in the "page cache" also act as a shared global mutable reference to a file region
i'm sorry i'm literally not doing that. global address space within a process? let's do it. the c standard gives me no choice. but also i can whisper in my cpu's ear and it remaps it! that's groovy. i'm with it.
but cross-process shared memory? accessible via uniform global path string? i will explicitly map a shared memory page if i want that!!!!
not only is "immediately globally visible" an excessively uncommon requirement, linux gives you literally no way to opt out of it
i do wish memcpy in the kernel could somehow be cast asynchronously and receive an async result
that would require the processor to have a request/response flag system but it would not be a goddamn event loop
splice() shouldn't be a page mapping operation because it shouldn't occur in a single synchronous call. you should have to inform the kernel beforehand of your intent to splice
it loses more latency through a ring buffer, but here's the thing: if you can losslessly represent the dependency graph between operations in the kernel, you don't need to do anything synchronously. just let your action graph swell from one process execution, then identify the paths necessary to resolve the new set of action dependencies. will probably require a model of expected runtime for each data-motion operation... this sounds hard but actually doable
i did notice that io_uring required you to register files and buffers to avoid thrashing every time you read from an fd and i was like wowwwwwwww if only there were some magical way to provide task-local scratch space to the kernel that avoided thrashing by putting the task to sleep
legitimately i feel libc would be easier than posix i/o, because i'm guessing fread/fwrite don't require instantaneously broadcasting data through the page cache: i believe they have their own buffers. i'll go check posix grumble