so i'm thinking step 1 is adding "inline" buffers (a solution borrowed from soufflé): their cases don't generate any code of their own, and wherever they are used they get inlined directly into the using case.
this already covers my need for functional macros, so i can put off supporting C exports until later.
# a more practical example
inline mtof f32 f32
case
then mtof m (mul 440.0 (powf 2.0 (div (sub m 69.0) 12.0)))
inline addmul (x : s32) (y : s32) (z : s32) (out : s32)
case
t1 = mul x y
t2 = add t1 z
then addmul x y z t2
# usage
case
src x y
addmul x y 1 w
then dst w
# internally expands to
case
src x y
merge
case
t1 = mul x y
w = add t1 1
then dst w
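
to make the expansion step concrete, here is a minimal python sketch of the substitution it implies. this is not the actual compiler: the `Inline` class, the `expand` function and the tuple encoding of statements are all hypothetical, made up for illustration. the idea is to map each argument of the inline's `then` clause to the matching argument at the use site, then rewrite the body with that mapping.

```python
from dataclasses import dataclass

# Hypothetical model of an inline definition; none of these names come from
# the real implementation.
@dataclass
class Inline:
    name: str
    head: list   # arguments of the `then` clause, e.g. ["x", "y", "z", "t2"]
    body: list   # statements as (target, op, operands) tuples


def expand(defn, call_args):
    """Rewrite the body of `defn` for one use site.

    Each head argument is replaced by the argument passed at the use site.
    Names that do not appear in the head (locals such as t1) are kept as-is.
    """
    if len(call_args) != len(defn.head):
        raise ValueError(f"{defn.name}: expected {len(defn.head)} arguments")
    subst = dict(zip(defn.head, call_args))

    def rename(sym):
        return subst.get(sym, sym)

    return [(rename(target), op, [rename(a) for a in operands])
            for target, op, operands in defn.body]


# the addmul definition from above, encoded by hand
addmul = Inline(
    name="addmul",
    head=["x", "y", "z", "t2"],
    body=[("t1", "mul", ["x", "y"]),
          ("t2", "add", ["t1", "z"])],
)

# usage site: addmul x y 1 w
for target, op, operands in expand(addmul, ["x", "y", "1", "w"]):
    print(f"{target} = {op} {' '.join(operands)}")
```

for the `addmul x y 1 w` use site this produces the same two statements as the expansion above. the sketch leaves locals like `t1` untouched, so a real implementation would presumably need to give them fresh names to avoid clashing with variables already bound in the using case.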