are filesystems on linux just safer than the network subsystems, or do filesystems just never expose any interface besides posix i/o, so they have a much smaller and better-characterized attack surface?
hm so a "thread" is mostly memory context? but then it's less of an "OS thread" than a "cpu thread" imho
the uvm phd dissertation begins with a tirade on copying and i feel i may have embarrassed myself saying zero-copy is ridiculous. but also this is 1998 and multicore wasn't a thing, so each copy was a synchronous computation
"Finally, data copying often flushes useful information out of the cache. Since the CPU accesses main memory through the cache, the process of copying data fills the cache with the data being copied — displacing potentially useful data that was resident in the cache before the copy."
oh i'll be upset if the CPU manages the cache without me. i bet it does
so when people say memory bandwidth is often a limiting factor that's because the OS doesn't actually manage the cache? i guess this makes sense, especially with atomic coherency (i know that's in the cpu)
i was gonna say "what if they let me manage the cpu cache. just a little. as a treat" but x86 absolutely does have prefetch hint instructions! i remember this for reasons
see i will absolutely get all pouty and demand to control the kernel's i/o mechanics but the processor is cute it's trying its best i'm fine with a hint
so far it still seems that i am perfectly allowed to avoid linux's horrific decision to make entries in the "page cache" also act as a shared global mutable reference to a file region
i'm sorry i'm literally not doing that. global address space within a process? let's do it. the c standard gives me no choice. but also i can whisper in my cpu's ear and it remaps it! that's groovy. i'm with it.
but cross-process shared memory? accessible via uniform global path string? i will explicitly map a shared memory page if i want that!!!!
not only is "immediately globally visible" an excessively uncommon requirement, linux gives you literally no way to opt out of it
i do wish memcpy in the kernel could somehow be cast asynchronously and receive an async result
that would require the processor to have a request/response flag system but it would not be a goddamn event loop
splice() shouldn't be a page mapping operation because it shouldn't occur in a single synchronous call. you should have to inform the kernel beforehand of your intent to splice
it loses more latency through a ring buffer, but here's the thing: if you can losslessly represent the dependency graph between operations in the kernel, you don't need to do anything synchronously. just let your action graph swell from one process execution, then identify the paths necessary to resolve the new set of action dependencies. will probably require a model of expected runtime for each data-motion operation... this sounds hard but actually doable
i did notice that io_uring required you to register files and buffers to avoid thrashing every time you read from an fd and i was like wowwwwwwww if only there were some magical way to provide task-local scratch space to the kernel that avoided thrashing by putting the task to sleep
legitimately i feel libc would be easier than posix i/o, because i'm guessing fread/fwrite don't require instantaneously broadcasting data through the page cache: i believe they have their own buffers. i'll go check posix grumble