of course the toml crate has serde on it
cargofuckyourself
[morpheus.jpg] what if i told you
you could build an extensible task system
outside the build system
i'm thinking about this because:
(1) this is also the exact thing that pip needs for parallel cacheable package finding
(2) the one thing that always annoyed me about pants (and that i spent quite some time attempting to fix) was how it stuffed like 300 separate individual great ideas into a single monolithic cli
pants in particular demonstrates a very powerful request-response model with a cyclic task dependency graph. as pants grew bigger and added more functionality, its reliance on a persistent daemon became self-reinforcing
pants could previously be invoked through a ./pants shell script at the root of the repo (i still have the muscle memory). it now employs scie-jump, a system for bootstrapping executables that can be thought of as a generalization of pex zipapps
the point is that even the monolithic pants is now assembled from a specification. and i think furthermore that such an assembly process can be a component of hooking up toolchains that don't want to talk to each other
consider autoconf. the thing i've wanted to hack into it for ages is parallel test evaluation. the part i hadn't considered was how the result of a test can often be cached globally. when is this not possible? that can only be answered if tests are able to specify their distinct inputs, i.e. their dependencies. autoconf sadly does not have a task system
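here's a tiny sketch of what i mean, none of it is autoconf and the helpers are mine: tests that declare their inputs get a global cache key from a digest of those inputs, and independent tests can run in parallel

```python
# a sketch, not autoconf: each test declares its inputs, the cache key is a digest of them,
# and tests with no dependency on each other run in parallel.
import hashlib, json
from concurrent.futures import ThreadPoolExecutor

CACHE = {}  # digest -> result; a real version would persist this globally on disk

def cache_key(test_name, inputs):
    blob = json.dumps({"test": test_name, "inputs": inputs}, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

def run_test(test_name, inputs, check):
    key = cache_key(test_name, inputs)
    if key not in CACHE:
        CACHE[key] = check(inputs)
    return CACHE[key]

def check_have_zstd_header(inputs):
    # stand-in for "compile a one-line program with these flags and see if it works"
    return "zstd" in inputs.get("cflags", "")

tests = [
    ("have_zstd_h", {"cc": "cc", "cflags": "-I/opt/zstd/include"}, check_have_zstd_header),
    ("have_zstd_h", {"cc": "cc", "cflags": ""}, check_have_zstd_header),
]

with ThreadPoolExecutor() as pool:
    results = list(pool.map(lambda t: run_test(*t), tests))
```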
cargo's problems are legion:
(1) build scripts can't do IPC to other build scripts
(2) build scripts can't fetch external resources the way cargo fetches rust dependencies for them
(3) build scripts don't have any shared output space where one package can compute an output that's shared with others, even in the same dep graph
the way packages can and do work around this is to simply write to a known filesystem path as a primitive form of IPC. this is highly problematic: there's no coordination between writers, no invalidation, and no record of what produced the file
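roughly what that workaround amounts to, sketched in python rather than a build.rs (the path and helper are made up):

```python
# every package races to a well-known path with at best an advisory lock; nothing
# invalidates the file and nothing records who produced it.
import fcntl, os

SHARED = os.path.expanduser("~/.cache/shared-artifact")  # hypothetical well-known path

def get_or_build_shared(build):
    os.makedirs(os.path.dirname(SHARED), exist_ok=True)
    with open(SHARED + ".lock", "w") as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)  # unix-only advisory lock
        if not os.path.exists(SHARED):
            tmp = SHARED + ".tmp"
            with open(tmp, "wb") as f:
                f.write(build())
            os.rename(tmp, SHARED)
    return SHARED
```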
pip's case is interesting too:
(1) it has a specific fetch-parse-cache process it needs to perform before it can do any resolve work (sketched after this list)
(2) it needs to be portable, so it can't add its own native code
(3) it has builds and fetches that can occur in parallel
(4) it wants to avoid being a build system like cargo, but it ends up propagating build provenance through its outputs
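on (1): a sketch of the fetch-parse-cache shape, very much not pip's actual code -- index pages fetched in parallel, parsed once, and cached on disk keyed by a digest of the url so later resolves can skip the network

```python
# not pip's code: fetch simple-index pages in parallel, parse the links once, cache the
# parsed result on disk keyed by a digest of the url.
import hashlib, json, pathlib, urllib.request
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser

CACHE_DIR = pathlib.Path("~/.cache/index-pages").expanduser()

class LinkParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href")

def find_links(index_url, project):
    url = f"{index_url}/{project}/"
    cached = CACHE_DIR / (hashlib.sha256(url.encode()).hexdigest() + ".json")
    if cached.exists():
        return json.loads(cached.read_text())
    parser = LinkParser()
    with urllib.request.urlopen(url) as resp:
        parser.feed(resp.read().decode())
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    cached.write_text(json.dumps(parser.links))
    return parser.links

with ThreadPoolExecutor() as pool:
    projects = ["requests", "numpy"]
    results = dict(zip(projects, pool.map(
        lambda p: find_links("https://pypi.org/simple", p), projects)))
```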
the pants project used to have a set of binary bootstrap scripts for things like gcc (that's the one i maintained for twitter's c++ code)
the reasons this arose were:
(1) i need to build zstd if not available
(2) i need to build curl if not available
(3) i need to build make if not available
there are distinct contexts separating all of these! a bootstrap make doesn't need to have guile extensions. a bootstrap curl doesn't need to have zstd built in. a bootstrap zstd doesn't need multithreading
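that observation sketched as data (the configure flags are illustrative assumptions, not a tested recipe): each bootstrap target carries its own minimal feature set, distinct from the others

```python
# illustrative only: the point is that each bootstrap target has its own minimal context.
from dataclasses import dataclass

@dataclass(frozen=True)
class BootstrapSpec:
    name: str
    version: str
    configure_args: tuple = ()

BOOTSTRAP_CONTEXTS = [
    BootstrapSpec("make", "4.4.1", ("--without-guile",)),
    BootstrapSpec("curl", "8.5.0", ("--without-zstd", "--without-brotli")),
    BootstrapSpec("zstd", "1.5.5"),  # a single-threaded build would be plenty here
]
```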
a task system for recursive bootstrapping (generating shared resources like interpreters and libraries) that also fetches and caches...........
pants had this separation between file/directory content (stored persistently in a merkle tree in a k/v db by checksum key) and in-memory build products. one problem with it was that it stored absolutely every single intermediate file and directory used over the course of a bootstrapping process. really wasteful and necessitated the obnoxious LMDB store. this was the passion project of the bazel engineer who convinced twitter to switch to bazel
but a persistent file store that stores important nodes is a really sick idea. just earlier i built a terrible "resource" fetching abstraction for cargo build scripts and realized, wait, this is kind of like the pants file store? and then recalled that spack has one of these too, but it stores specific intermediate directory outputs
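the rough shape of such a store, sketched minimally (this is not pants' implementation, just the idea): file bytes keyed by digest, directories as small manifests of name -> digest, and only the nodes you care about get stored

```python
# not pants' implementation, just the idea of a content-addressed store.
import hashlib, json, pathlib

STORE = pathlib.Path("~/.cache/cas").expanduser()

def put_blob(data: bytes) -> str:
    digest = hashlib.sha256(data).hexdigest()
    path = STORE / digest[:2] / digest
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
    return digest

def put_tree(directory: pathlib.Path) -> str:
    manifest = {}
    for child in sorted(directory.iterdir()):
        if child.is_dir():
            manifest[child.name + "/"] = put_tree(child)
        else:
            manifest[child.name] = put_blob(child.read_bytes())
    return put_blob(json.dumps(manifest, sort_keys=True).encode())
```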
i think that for complex tasks, python is the right way to achieve extensibility. no glorified yaml like starlark with its own mysterious semantics. it's gonna be a real programming language
but the thing autoconf does achieve is not requiring you to have a python interpreter beforehand, which is why it can be used to build the python interpreter
one other curious thought from the pip case: there may be information that should be globally available, but is highly inefficient in distinct file outputs--it needs to be synchronized into some shared state (like a sqlite db). this is how the available versions for a python package can be made queryable
that's notable because it represents a different kind of state and a different type of dependency than other tasks
in the case of querying python indices, it represents the world state at that time
(importantly, the "world state" paradigm can also be applied to other forms of global mutable state like the filesystem. this is one problem with spack externals -- they don't have any concept of invalidation)
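a sketch of what that shared state could look like (the table layout is mine, purely illustrative): available versions synced into sqlite, with a timestamp recording when the world was observed so rows can be invalidated or refreshed later

```python
# illustrative table layout: versions plus a record of when the world was observed.
import sqlite3, time

db = sqlite3.connect("world.db")
db.execute("""
    CREATE TABLE IF NOT EXISTS available_versions (
        project TEXT NOT NULL,
        version TEXT NOT NULL,
        index_url TEXT NOT NULL,
        observed_at REAL NOT NULL,
        PRIMARY KEY (project, version, index_url)
    )
""")

def record_versions(project, index_url, versions):
    now = time.time()
    db.executemany(
        "INSERT OR REPLACE INTO available_versions VALUES (?, ?, ?, ?)",
        [(project, v, index_url, now) for v in versions])
    db.commit()

def fresh_versions(project, max_age_seconds):
    cutoff = time.time() - max_age_seconds
    rows = db.execute(
        "SELECT version FROM available_versions WHERE project = ? AND observed_at >= ?",
        (project, cutoff))
    return [v for (v,) in rows]
```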
so where does it end?
so if we want a zstd to use for bootstrapping, we have two options:
(1) use pkg-config to find one matching the desired version spec
(2) bootstrap a known version from source
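both paths sketched in one place (the bootstrap task is stubbed out; version numbers are just examples):

```python
# option 1: ask pkg-config for a matching libzstd. option 2: bootstrap a known version.
import subprocess

def find_system_zstd(spec="libzstd >= 1.5.0"):
    # pkg-config exits 0 only when an installed module satisfies the constraint
    if subprocess.run(["pkg-config", "--exists", spec]).returncode != 0:
        return None
    return subprocess.run(["pkg-config", "--modversion", "libzstd"],
                          capture_output=True, text=True).stdout.strip()

def bootstrap_zstd_from_source(version):
    # in the imagined task system this would be a cached task: fetch, unpack, make
    raise NotImplementedError

def obtain_zstd():
    found = find_system_zstd()
    if found is not None:
        return ("system", found)
    return ("bootstrapped", bootstrap_zstd_from_source("1.5.5"))
```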
spack is a massive piece of machinery to resolve a dependency graph with versions of c and c++ and python etc. it's not worth anyone's time to recreate from scratch. but spack itself has bootstrap needs (it needs python)
it would be impossible to reproduce spack-style resolves without spack itself. but we could wrap spack, after bootstrapping its dependencies
omg we could bootstrap a rust toolchain. that'd be so fun and flirty
if we can bootstrap python, we can bootstrap pip
guix has already gone down the reproducibility path with gnu mes. we won. final boss defeated
pants dug itself into a bit of a corner. in fact it's about to lose one of the most powerful aspects of the rule graph because scanning for internal (recursive) calls poses a performance problem, so the recursive calls need to perform the type-based resolution by hand https://github.com/pantsbuild/pants/issues/19730
i had proposed defining rules in webassembly or even jvm bytecode before, because one of the biggest points of feedback from twitter engineers was "we don't understand this complex python system, we write scala code". this was one of the most reasonable and honest bits of feedback i've ever received
how do you architect a system that crosses ecosystems?
build tools have largely converged upon process executions as the shared unit of work and the filesystem as IPC. this is the most successful and portable FFI i am aware of
process executions are also memory-safe by default, thanks to the OS's address space isolation. there are portable synchronization tools for the filesystem alone, and higher-performance ones depending upon the platform. there's a standard resource discovery mechanism, and a standard name assignment mechanism
there are some very small downsides:
(a) the filesystem is a hardware resource, and therefore has some inherent limits
(b) the OS does its own bookkeeping at the vfs layer
(c) process execution alone imposes a mild overhead
so we have a very very general high-level database in the filesystem, and a very very general FFI mechanism in process execution. and these can be used to communicate between tools that don't otherwise have a shared protocol--like cargo and spack, or cargo and pip, or cargo and literally anything
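the smallest sketch of that FFI i can come up with (the helper names are mine): a task is argv plus input files, its cache key is a digest of both, and its outputs are whatever files the process leaves behind in a scratch directory. anything that can be exec'd speaks this protocol

```python
# a task = argv + input files; cache key = digest of both; outputs = files left behind.
import hashlib, pathlib, shutil, subprocess, tempfile

CACHE = pathlib.Path("~/.cache/process-outputs").expanduser()

def run_cached(argv, input_files):
    h = hashlib.sha256(repr(argv).encode())
    for f in sorted(input_files):
        h.update(pathlib.Path(f).read_bytes())
    out_dir = CACHE / h.hexdigest()
    if out_dir.exists():
        return out_dir  # cache hit: reuse the previous outputs
    scratch = pathlib.Path(tempfile.mkdtemp())
    for f in input_files:
        shutil.copy(f, scratch)
    subprocess.run(argv, cwd=scratch, check=True)
    CACHE.mkdir(parents=True, exist_ok=True)
    shutil.move(str(scratch), str(out_dir))
    return out_dir
```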
we can also add network requests to the above, and we'll have outlined a system that can bootstrap curl
what if we could define tasks that could be fulfilled by a python function, a jvm function, or a process execution? what then?
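one possible shape (all names here are hypothetical; the jvm case is shown as a plain process execution of `java` since that's the portable lowest common denominator, and a persistent jvm could slot in behind the same interface):

```python
# one request shape, several interchangeable fulfillment backends.
import subprocess
from typing import Callable, Protocol

class Fulfillment(Protocol):
    def __call__(self, request: dict) -> dict: ...

def python_backend(fn: Callable[[dict], dict]) -> Fulfillment:
    return fn  # runs in-process

def process_backend(argv: list) -> Fulfillment:
    def run(request: dict) -> dict:
        proc = subprocess.run(argv + [request["arg"]],
                              capture_output=True, text=True, check=True)
        return {"stdout": proc.stdout}
    return run

def jvm_backend(jar: str, main_class: str) -> Fulfillment:
    return process_backend(["java", "-cp", jar, main_class])

FULFILLMENTS = {
    "checksum": python_backend(lambda req: {"digest": "..."}),          # python function
    "compile": jvm_backend("compiler.jar", "com.example.CompileMain"),  # jvm function
    "fetch": process_backend(["curl", "-sSLO"]),                        # process execution
}
```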
another of the many projects i've probably spent at least a month of 8-hour days on: https://github.com/cosmicexplorer/upc. this one tried to virtualize i/o
i think i had it inside out (https://github.com/cosmicexplorer/upc/blob/master/local/virtual-cli/client/VFS.scala): instead of making everything conform to the process/filesystem interface, make process/filesystem conform to the everything interface
@hipsterelectron cargo
at least it's not cmake
@hipsterelectron there’s always merde (nope, he started facet, that’s right)
https://github.com/bearcove/merde