Post by @hipsterelectron@circumstances.run

@hipsterelectron@circumstances.run 2 months ago

i feel that the grammar of a programming language is among the least appropriate of all possible facets of its behavior to start off with. why on earth would i care about your preferred tokens to represent concepts which have not yet been defined

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

omgggg NO THEY FUCKING DIDN'T MAKE THEIR RUNTIME BEHAVIOR A PRODUCT OF THEIR COMPILE-TIME SEMANTICS https://smlfamily.github.io/sml97-defn.pdf

Since signature expressions are mostly dealt with in the static semantics, the dynamic semantics need only take limited account of them.

this is so unserious. the static semantics don't even exist to me yet. i can't believe someone would write a document that claims to describe behavior without an explicit lowering process

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

like this is what you would need if you want someone to interop their FFI with your language

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

Hitherto, the semantic rules have not exposed the interactive nature of the language.

i feel this assumes a great deal. this seems more of the semantic rules of a particular CLI program provided with the default distribution?

During an ML session the user can type in a phrase, more precisely a phrase of the form topdec as defined in Figure 8, page 15.

"during an ML session" omg
"the user can type": that is not meaningful without describing a whole lot more interactions with the OS

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

In practice, ML implementations may provide a directive as a form of top-level declaration for including programs from files rather than directly from the terminal.

FILES??? TERMINAL????

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

So far, for simplicity, we have used the same notation B to stand for both a static and a dynamic basis, and this has been possible because we have never needed to discuss static and dynamic semantics at the same time.

i understood compilation to be the translation from an IR (static semantics) into the ABI (dynamic semantics)

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

C Appendix: The Initial Static Basis
In this appendix (and the next) we define a minimal initial basis for execution. Richer bases may be provided by libraries.

"execution" means something different to the authors than it does to me

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

omg

At the same time, imperative features were important for practical reasons; no-one had experience of large useful programs written in a pure functional style. In particular, an exception-raising mechanism was highly desirable for the natural presentation of tactics.

these are still a matter of grammar to me. "imperative features" is an interface to the programmer imho

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

The full definition of this first version of ML was included in a book [19] which describes LCF, the proof system which ML was designed to support.

literally omfg why didn't you send me there FIRST?????

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

Other early influences were the applicative languages already in use in Artificial Intelligence.

i am not the expected audience

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

i keep reading to find when i'm gonna find some discussion of semantics. hasn't happened yet

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

the reason i fell into this trap in the first place is that i wanted to understand what "C formalized in HOL" was on about https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-453.pdf

C also combines a number of interesting features on the theoretical front, making it additionally interesting as a subject of study.

this is not something i wanna hear from someone claiming to have formalized it

For example, C’s expressions both are side-effecting and have very under-specified evaluation orders. If these semantic features were the main area of interest in studying a language, then it would clearly be easier to construct a simple calculus that included these features and little else.

this is ridiculous. obviously these semantic features are not ideal when attempting to write code that runs e.g. in ring 0. yet people do it (and this is meaningfully outside the C standard). the UB becomes defined thanks to our friends who write the compiler. is it worth attempting to standardize ring 0 properties?

However, we prefer to attack as much of C as possible all at once. As Milner and Tofte point out in the commentary on the definition of SML [MT90], this study of languages in their entirety has its own grounds for interest, and we further feel that our study of C gives us a possible application in the area of software verification.

they didn't even mention a single concrete implementation until the appendices

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

This style of definition was used in the definition of Standard ML by Milner, Tofte and Harper [MTH90]. This example, one of the most famous formal language definitions, is a clear demonstration that a large language can be formalised in this manner.

i'm getting the impression that the seL4 HOL C semantics may not be as useful as it's being let on lmao

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

Precisely because the standard’s definition of C is not formal, we can never hope to prove our formal definition consistent with it.

actively violent and evil thing to say

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

At best we can hope that our definition comes to be seen as correct by the community of people concerned with C’s definition and standardisation.

this is now becoming kind of worrying. a formal semantics can be matched to the behavior from a compiler and our friends in the compiler and in our CPU architecture manuals can describe whether it matches the "formalization"

Such a community can perform very useful error-checking.

how does anyone write this stuff

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

In addition, if used as the basis for software tools that do not necessarily require a deep understanding of its details, a formal semantics may come to be accepted as correct simply because of what it has made possible in the pragmatic domain.

this is FUCKED! a formal semantics is not something you can bully people into accepting. jfc

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

A denotational semantics defines an appropriate mathematical space as a model for a language, and maps the language’s syntax into that space in a way that is compositional. This property requires that the semantics of a syntactic phrase be a function of the semantics of the phrase’s syntactic sub-components.

so "denotational semantics" is a made up interpretation that conforms to some fuckboy's idea of aesthetically pleasing. see i'm learning so much
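For reference, the compositionality property in the quoted passage can be sketched on a toy language. This is a minimal illustration only, assuming a hypothetical four-constructor expression language (`Num`, `Var`, `Add`, `Mul`) and taking "functions from environments to integers" as the mathematical space; none of these names come from the thesis.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union

Env = Dict[str, int]               # assumed model of variable bindings
Denotation = Callable[[Env], int]  # the "mathematical space": Env -> int


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Var, Add, Mul]


def denote(e: Expr) -> Denotation:
    """Map syntax to meaning; each case uses only the meanings of sub-phrases."""
    if isinstance(e, Num):
        return lambda env: e.value
    if isinstance(e, Var):
        return lambda env: env[e.name]
    if isinstance(e, Add):
        left, right = denote(e.left), denote(e.right)
        return lambda env: left(env) + right(env)
    if isinstance(e, Mul):
        left, right = denote(e.left), denote(e.right)
        return lambda env: left(env) * right(env)
    raise TypeError(f"not an expression: {e!r}")


# 2 + x * 3, denoted once and then applied to an environment
expr = Add(Num(2), Mul(Var("x"), Num(3)))
meaning = denote(expr)
```

The meaning of `Add` is built only from the meanings of its two children, never from their syntax, which is the whole content of "compositional" here.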

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

it keeps going. now he's claiming to be the first to have invented the C abstract machine (operational semantics)

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

OMG

It is central to our thesis that the semantics of C is so complicated that it can only be usefully manipulated in the context of a theorem prover.

THE C STANDARD IS WRITTEN BY HUMANS? FOR HUMANS?

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

this is also pretty worrying because he dismissed earlier ever conforming with the C standard, and seL4 literally just asserts that its C code conforms to the model

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

like this was not some newfangled thing people started doing recently! people writing code that needs to validate nontrivial properties generally do it by actually getting their hands dirty and doing the work to link the compiler's internal semantics to the representation made in HOL or whatever

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

Without mechanical support, reasoning with a big semantics is error-prone, and it can be hard to be confident that one’s proofs are actually correct.

does he..........how does he think c compilers work

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

Using a theorem prover means that we are confident that all of the results we have proved are correct.

"correct"

Having used the theorem prover HOL [GM93], we are particularly confident, as this system, following the example of its ancestor system LCF [GMW79], uses the strong type system of ML to guarantee that values of type theorem are only produced in ways that are logically sound.

that's it. that's your persuasive essay???
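The LCF-style argument in the quoted passage is that a value of type theorem can only come out of the kernel's inference rules. A rough sketch of that idea follows, with the loud caveat that Python can only check this at runtime with a private token, whereas ML enforces it statically through an abstract type. The names `Theorem`, `Imp`, `assume`, and `modus_ponens` are hypothetical and are not HOL's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Imp:
    """Implication between two propositions (atoms are plain strings)."""
    antecedent: object
    consequent: object


_KERNEL = object()  # private token; stands in for ML's abstract type boundary


class Theorem:
    """hyps |- concl. Only the rules below are meant to construct these."""

    def __init__(self, hyps, concl, _token=None):
        if _token is not _KERNEL:
            raise ValueError("theorems can only be built by inference rules")
        self.hyps = frozenset(hyps)
        self.concl = concl


def assume(p):
    """Rule ASSUME: p |- p."""
    return Theorem({p}, p, _KERNEL)


def modus_ponens(imp_thm, ant_thm):
    """Rule MP: from A |- p -> q and B |- p, derive A u B |- q."""
    c = imp_thm.concl
    if not isinstance(c, Imp) or c.antecedent != ant_thm.concl:
        raise ValueError("modus ponens does not apply")
    return Theorem(imp_thm.hyps | ant_thm.hyps, c.consequent, _KERNEL)


th = modus_ponens(assume(Imp("p", "q")), assume("p"))
```

Note what this does and does not buy: soundness relative to the rules in the kernel. Whether the axioms and definitions fed into it model the thing you care about is a separate question.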

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

he keeps mentioning like "yeah these theorems take a lot of effort to prove.......and often they're completely unusable too" like sir have you considered that things being difficult might indicate that you need to find a semantics engine that doesn't hate you

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

cambridge still batting 100% on being actively evil people who just write whatever they want on official letterhead

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

https://trustworthy.systems/publications/papers/Tuch%3Aphd.pdf

this one is hosted on the seL4 site

Systems impose on languages many abstraction breaking requirements

"systems" lmao

and are not usually considered amenable to implementation in higher-level languages like Java and ML.

yeah cause the JVM abstract machine is specifically built to be this fucked up carnival ride. you could do it if you specifically forked the JVM. hate this lack of precision from ppl who are so loud about "formalism"

For example, zero-copy I/O and address translation are crucial features

zero-copy IO and address translation are extremely different things. zero-copy IO doesn't even make sense in ring 0 and is not in fact a "crucial feature". it's not even a language feature!

and programmers demand the freedom to control data structure layout [87],

you can "control data structure layout" in any language that lets you address bytes which i think is literally all of them. C struct layout is actually rly annoying because you can't let the compiler help you at all
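As a small illustration of byte-level layout control outside C, here is a sketch using Python's standard `struct` module. The field set (a u8, a u32, a u16) and the value `0xDEADBEEF` are made up for the example; the `<` prefix selects little-endian with standard sizes and no implicit padding, so any padding has to be written out with `x`.

```python
import struct

# '<' = little-endian, standard sizes, no implicit alignment padding
PACKED = struct.Struct("<BIH")      # u8, u32, u16 laid out back to back
PADDED = struct.Struct("<B3xIH2x")  # same fields, padding spelled out by hand

packed_bytes = PACKED.pack(1, 0xDEADBEEF, 2)
padded_bytes = PADDED.pack(1, 0xDEADBEEF, 2)

print(PACKED.size, packed_bytes.hex())
print(PADDED.size, padded_bytes.hex())
```

Both layouts carry the same three fields; the only difference is where the programmer chose to put the pad bytes.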

in particular when optimising the cache and TLB footprint that is typically opaque in such languages.

those aren't your data structures those are the CPU's and that's ring 0 again, not a language feature

Inside the research community there are recent promising efforts at harnessing the gains of the last three decades of programming language research [8, 22, 29, 37, 46, 68, 89],

guy who knows nothing about anything he just said: "i represent the 'research community' and we will exterminate your kind"

with an emphasis on types and static checking, when implementing systems.

this guy grew into the rust evangelism strike force

However, these advances are yet to be popularised in industry

guy who thinks "systems" are an industry-specific thing

and still face enormous scepticism from systems implementors who are highly obsessed with efficiency, sometimes to the extreme where clock cycles are the metric of choice.

this fucking guy!!!!! clock cycles can actually be counted reliably lmao. THIS is what seL4 is standing behind

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

Even today, it is easy to violate the C type system by its cast mechanism and through address arithmetic.

guy who thinks C's type system is being violated through casting and address arithmetic. you know those have concrete semantics right

The programmer is given, intentionally, access to low-level bit and byte representations of values in memory.

again, that's literally every language
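A quick sketch of what that looks like in a language nobody calls low-level, assuming IEEE 754 single precision (which is what `struct`'s `f` format uses) and an explicit little-endian byte order:

```python
import struct

raw = struct.pack("<f", 1.0)          # the four bytes of the float 1.0
bits = struct.unpack("<I", raw)[0]    # the same bytes read back as a u32

buf = bytearray(raw)
buf[3] |= 0x80                        # set the sign bit in the top byte
negated = struct.unpack("<f", buf)[0]

print(raw.hex(), hex(bits), negated)
```

Same bytes, two readings, plus a direct edit of the representation.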

There are no checks on array bounds when indexing — this would violate C’s design philosophy.

the guy who is telling you with a straight face that he totally formalized C semantics for high-assurance ring 0 scenarios is now telling you he finds the language detestable

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

god it would be so cool if rust gave a shit about correctness

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

C does not have garbage collection and the programmer is responsible for allocation and deallocation of memory through library calls.

"library calls" why would you declare that you don't know the semantics at all

A systems implementor may even develop his or her own memory allocator that replaces this already low-level interface, enabling direct management of the physical memory in a system.

THIS IS THE GUY WHO IS CLAIMING HE KNOWS WHAT SEMANTICS ARE!

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

Unfortunately, systems code is by no means strictly conforming and we could say by definition requires the ability to violate the standard’s strict rules on how memory can be accessed.

i am literally going to go find the C standard right now because the model of globally addressable memory space is i'm pretty sure the one thing that's not violated

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

like personally i think someone (not this guy) could make a pretty effective case for having correctly represented the semantics of C in ring 0 in a theorem prover even if they didn't link it to precise lines of C code through a model in the compiler,,,,

but if i was ever gonna say anything like "high-assurance" or "secure" i would actually do the work to link my semantic model to the one in the compiler and the CPU/RAM. and i would bully c standards people into accepting it

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

As a result, when describing type safety with respect to a C program in this thesis, we refer to a looser notion,

bruh. don't say things like that

where we may require expressions that designate a memory object to have a type corresponding to the expected value stored in memory.

he should have said "type" to clarify that that was gonna be the subject of debate. but this guy represents the "research community" so i bet he thinks his type is Correct

Program fragments can be type-safe if all their expressions have this property and later we formalise what is meant by the expected value’s type.

"type-safe". usually in cryptography we don't invoke generic informal terminology when we want people to take us seriously

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

Memory management code tracks the free memory that can be allocated and also sometimes the memory that has been allocated.

he just keeps going??????? here i'll translate:

"the free memory that can be allocated": sometimes non-micro kernels like linux maintain free lists of unmapped physical pages so that moving the sbrk can be made very fast if not completely atomic
"and also sometimes the memory that has been allocated": i suspect this is referring to a process's virtual address mapping, but maybe it's referring to an in-kernel allocator

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

This is commonly done through pointer-linked data structures,

why are we still saying "pointer" when we're in ring 0???? that's a physical address buddy

and this use of what are also called mutable inductively-defined data structures

no citation here is so disrespectful lmao

is the cause of a great degree of the difficulty in reasoning about such code formally.

i'm sorry you're having difficulty maybe it's time to give it up???

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

This difficulty, a direct consequence of the use of indirection,

how are you still negging the reader like this

can be broken down as the aliasing [14] and frame [61] problems.

oh my GOD!!!!! ok so these fucking citations my god

[14] this is literally about virtual memory conforming to the C standard https://eis.mdx.ac.uk/staffpages/r_bornat/papers/MPC2000.pdf

The final difficulty is the complexity of the proofs: not only do we have to reason formally about sets, sequences, graphs and trees, we have to make sure that the locality of assignment operations is reflected in the treatment of assertions about the heap.

EVEN THAT PAPER'S AUTHOR IS TELLING HIM TO DO HIS FUCKING JOB LOL

For all of these reasons, Hoare logic isn’t widely used to verify pointer programs. Yet most low-level and all object-oriented programs use heap pointers freely. If we wish to prove properties of the kind of programs that actually get written and used, we shall have to deal with pointer programs on a regular basis.

d@nny disc@

@hipsterelectron@circumstances.run 2 months ago

literally nothing will prepare you for [61]