i feel that the grammar of a programming language is among the least appropriate of all possible facets of its behavior to start off with. why on earth would i care about your preferred tokens to represent concepts which have not yet been defined
Post
In addition, if used as the basis for software tools that do not necessarily require a deep understanding of its details, a formal semantics may come to be accepted as correct simply because of what it has made possible in the pragmatic domain.
this is FUCKED! a formal semantics is not something you can bully people into accepting. jfc
@hipsterelectron Kind of reminds me of "Proven in Use" defense in automotive, SIL-levels.
Because a certain design has worked for 50 years, it's taken to be correct. Something like that. Now we are down to a few weeks or a product ?
"The requirements of these schemes can be met either by establishing a rigorous development process, or by establishing that the device has sufficient operating history to argue that it has been proven in use."
A denotational semantics defines an appropriate mathematical space as a model for a language, and maps the language’s syntax into that space in a way that is compositional. This property requires that the semantics of a syntactic phrase be a function of the semantics of the phrase’s syntactic sub-components.
so "denotation semantics" is a made up interpretation that conforms to some fuckboy's idea of aesthetically pleasing. see i'm learning so much
it keeps going. now he's claiming to be the first to have invented the C abstract machine (operational semantics)
OMG
It is central to our thesis that the semantics of C is so complicated that it can only be usefully manipulated in the context of a theorem prover.
THE C STANDARD IS WRITTEN BY HUMANS? FOR HUMANS?
@hipsterelectron ... if *C* is that complicated, the higher level languages must all be just completely fucked, then, by this standard.
this is also pretty worrying because he dismissed earlier ever conforming with the C standard, and seL4 literally just asserts that its C code conforms to the model
like this was not some newfangled thing people started doing recently! people writing code that needs to validate nontrivial properties generally do it by actually getting their hands dirty and doing the work to link the compiler's internal semantics to the representation made in HOL or whatever
Without mechanical support, reasoning with a big semantics is error-prone, and it can be hard to be confident that one’s proofs are actually correct.
does he..........how does he think c compilers work
Using a theorem prover means that we are confident that all of the results we have proved are correct.
"correct"
Having used the theorem prover HOL [GM93], we are particularly confident, as this system, following the example of its ancestor system LCF [GMW79], uses the strong type system of ML to guarantee that values of type theorem are only produced in ways that are logically sound.
that's it. that's your persuasive essay???
he keeps mentioning like "yeah these theorems take a lot of effort to prove.......and often they're completely unusable too" like sir have you considered that things being difficult might indicate that you need to find a semantics engine that doesn't hate you
cambridge still batting 100% on being actively evil people who just write whatever they want on official letterhead
https://trustworthy.systems/publications/papers/Tuch%3Aphd.pdf
this one is hosted on the seL4 site
Systems impose on languages many abstraction breaking requirements
"systems" lmao
and are not usually considered amenable to implementation in higher-level languages like Java and ML.
yeah cause the JVM abstract machine is specifically built to be this fucked up carnival ride. you could do it if you specifically forked the JVM. hate this lack of precision from ppl who are so loud about "formalism"
For example, zero-copy I/O and address translation are crucial features
zero-copy IO and address translation are extremely different things. zero-copy IO doesn't even make sense in ring 0 and is not in fact a "crucial feature". it's not even a language feature!
and programmers demand the freedom to control data structure layout [87],
you can "control data structure layout" in any language that lets you address bytes which i think is literally all of them. C struct layout is actually rly annoying because you can't let the compiler help you at all
in particular when optimising the cache and TLB footprint that is typically opaque in such languages.
those aren't your data structures those are the CPU's and that's ring 0 again, not a language feature
Inside the research community there are recent promising efforts at harnessing the gains of the last three decades of programming language research [8, 22, 29, 37, 46, 68, 89],
guy who knows nothing about anything he just said: "i represent the 'research community' and we will exterminate your kind"
with an emphasis on types and static checking, when implementing systems.
this guy grew into the rust evangelism strike force
However, these advances are yet to be popularised in industry
guy who thinks "systems" are an industry-specific thing
and still face enormous scepticism from systems implementors who are highly obsessed
with efficiency, sometimes to the extreme where clock cycles are the metric of choice.
this fucking guy!!!!! clock cycles can actually be counted reliably lmao. THIS is what seL4 is standing behind
Even today, it is easy to violate the C type system by its cast mechanism and through address arithmetic.
guy who thinks C's type system is being violated through casting and address arithmetic. you know those have concrete semantics right
The programmer is given, intentionally, access to low-level bit and byte representations of values in memory.
again, that's literally every language
There are no checks on array bounds when indexing — this would violate C’s design philosophy.
the guy who is telling you with a straight face that he totally formalized C semantics for high-assurance ring 0 scenarios is now telling you he finds the language detestable
god it would be so cool if rust gave a shit about correctness
C does not have garbage collection and the programmer is responsible for allocation and deallocation of memory through library calls.
"library calls" why would you declare that you don't know the semantics at all
A systems implementor may even develop his or her own memory allocator that replaces this already low-level interface, enabling direct management of the physical memory in a system.
THIS IS THE GUY WHO IS CLAIMING HE KNOWS WHAT SEMANTICS ARE!
@hipsterelectron "Already low-level interface"
... What? Malloc (presumably the "Library Call" he means) is an abstraction (heap) playing with abstractions (Arenas, Blocks/Superblocks, threads, whatever), and often passing through other layers of abstraction like virtual memory...
Like, sure it's low-level compared to like... python, I guess?
But from the perspective of assembly & actual hardware interaces; most implementations of malloc() & co. are very high level...
Unfortunately, systems code is by no means strictly conforming and we could say by definition requires the ability to violate the standard’s strict rules on how memory can be accessed.
i am literally going to go find the C standard right now because the model of globally addressable memory space is i'm pretty sure the one thing that's not violated
like personally i think someone (not this guy) could make a pretty effective case for having correctly represented the semantics of C in ring 0 in a theorem prover even if they didn't link it to precise lines of C code through a model in the compiler,,,,
but if i was ever gonna say anything like "high-assurance" or "secure" i would actually do the work to link my semantic model to the one in the compiler and the CPU/RAM. and i would bully c standards people into accepting it
As a result, when describing type safety with respect to a C program in this thesis, we refer to a looser notion,
bruh. don't say things like that
where we may require expressions that designate a memory object to have a type corresponding to the expected value stored in memory.
he should have said "type" to clarify that that was gonna be the subject of debate. but this guy represents the "research community" so i bet he thinks his type is Correct
Program fragments can be type-safe if all their expressions have this property and later we formalise what is meant by the expected value’s type.
"type-safe". usually in cryptography we don't invoke generic informal terminology when we want people to take us seriously
Memory management code tracks the free memory that can be allocated and also sometimes the memory that has been allocated.
he just keeps going??????? here i'll translate:
- "the free memory that can be allocated": sometimes non-micro kernels like linux maintain free lists of unmapped physical pages so that moving the sbrk can be made very fast if not completely atomic
- "and also sometimes the memory that has been allocated": i suspect this is referring to a process's virtual address mapping, but maybe it's referring to an in-kernel allocator
This is commonly done through pointer-linked data structures,
why are we still saying "pointer" when we're in ring 0???? that's a physical address buddy
and this use of what are also called mutable inductively-defined data structures
no citation here is so disrespectful lmao
is the cause of a great degree of the difficulty in reasoning about such code formally.
i'm sorry you're having difficulty maybe it's time to give it up???
This difficulty, a direct consequence of the use of indirection,
how are you still negging the reader like this
can be broken down as the aliasing [14] and frame [61] problems.
oh my GOD!!!!! ok so these fucking citations my god
[14] this is literally about virtual memory conforming to the C standard https://eis.mdx.ac.uk/staffpages/r_bornat/papers/MPC2000.pdf
The final difficulty is the complexity of the proofs: not only do we have to reason formally about sets, sequences, graphs and trees, we
have to make sure that the locality of assignment operations is reflected in the treatment of assertions about the heap.
EVEN THAT PAPER'S AUTHOR IS TELLING HIM TO DO HIS FUCKING JOB LOL
For all of these reasons, Hoare logic isn’t widely used to verify pointer programs. Yet most low-level and all object-oriented programs use heap pointers freely. If we wish to prove properties of the kind of programs that actually get written and used, we shall have to deal with pointer programs on a regular basis.
This difficulty, a direct consequence of the use of indirection,
how are you still negging the reader like this
can be broken down as the aliasing [14] and frame [61] problems.
oh my GOD!!!!! ok so these fucking citations my god
[14] this is literally about virtual memory conforming to the C standard https://eis.mdx.ac.uk/staffpages/r_bornat/papers/MPC2000.pdf
The final difficulty is the complexity of the proofs: not only do we have to reason formally about sets, sequences, graphs and trees, we
have to make sure that the locality of assignment operations is reflected in the treatment of assertions about the heap.
EVEN THAT PAPER'S AUTHOR IS TELLING HIM TO DO HIS FUCKING JOB LOL
For all of these reasons, Hoare logic isn’t widely used to verify pointer programs. Yet most low-level and all object-oriented programs use heap pointers freely. If we wish to prove properties of the kind of programs that actually get written and used, we shall have to deal with pointer programs on a regular basis.
literally nothing will prepare you for [61]
literally nothing will prepare you for [61]
[61] McCarthy and P. Hayes. Some philosophical problems from the
standpoint of artificial intelligence. In D. Michie and B. Meltzer, editors,
Machine Intelligence 4, pages 463–502. Edinburgh University Press,
1969.
[61] McCarthy and P. Hayes. Some philosophical problems from the
standpoint of artificial intelligence. In D. Michie and B. Meltzer, editors,
Machine Intelligence 4, pages 463–502. Edinburgh University Press,
1969.
only possible alternative is he mistyped the reference address making a crucial point in his own phd thesis
[62] F. Mehta and T. Nipkow. Proving pointer programs in higher-order
logic. Information and Computation, 199(1-2):200–227, 2005.
and yes, it still assumes the heap. even though if you're managing physical memory. you do not have a heap
god fuck and even this example is literally impossible
For an example of aliasing, consider a program with two pointer variables
int * pandint * qand the following triple:{| True |} ∗p = 37 ; ∗q = 42 ; {| ∗p = ? |}
not only has he just said "triple" without a citation like that's a well-known thing, this is the problem with it:
We are unable to ascertain the value pointed to by
pas it may refer to the same location asq.
so you're telling me this C code:
int * p;
int * q;
*p = 37;
*q = 42;
demonstrates a classic aliasing problem....................does this guy even know about restrict or the concept of Undefined Behavior
turns out a "hoare triple" is the fuckboy term for "precondition and postcondition" when you think you're the first to ever create an abstract machine for the C programming language
he never cites anything about a Hoare triple either
Hoare triples, where a block of code is preceded by a pre-condition and followed by a post-condition, have already appeared in §1.1.1.
he did just refer you, with a hyperlinked section heading, to the line above where he just says "triple"
The aliasing problem is much worse for inductively-defined data structures, where it is possible that structural invariants can be violated, and where we need more sophisticated recursive predicates to stipulate aliasing conditions.
this problem is so bad he has no example
These predicates appear in specifications, invariants and proofs, and their discovery is often a time consuming trial-and-error process.
sorry that you hate your job?
The aliasing situation becomes untenable when code is type-unsafe and we are forced to seek improved methods.
if the C compiler can unambiguously interpret it then maybe the C compiler is a better proof framework than isabelle/HOL
If instead we had a variable
float * p:{| True |} ∗p = 3 .14 ; ∗q = 42 ; {| ∗p = ? |}then not only do we have to consider aliasing between pointers of different types, but also the potential for p to be pointing inside the encoding of
∗qand vice versa.
i don't know what language you're verifying but C has semantics here
We talk about this phenomenon as inter-type aliasing.
who's we
While specifications may mention some state that is affected by the intended behaviour of a program, it is hard to capture the state that is not changed.
literally a nonsense sentence
In the above example, a client verification that also dereferences a pointer
r, not mentioned in the specification, has no information on its value after execution of the code fragment.
meaningless
This limits reusability and hence scalability of verifications.
the C compiler is a better proof framework than isabelle/HOL
now he's explaining why he trusts machine output more than anyone
Formal reasoning with the usual process of pen-and-paper mathematical proof neither scales as we would like in software verification nor has the expected degree of rigour.
you know humans made isabelle/HOL too right
"scales" i think he literally just means deskilling here. he's literally saying actually formal verification is more reliable than engineers you pay money to
now he's trying to mansplain "algorithmic verification" as if model checking in TLA+ https://lamport.azurewebsites.net/tla/high-level-view.html isn't actually extremely widely used in this exact space already
Research in this area has produced impressive results in recent years with improvements in the underlying theory and increased available computing power.
theory and computing power are the only things that produce impressive results. no i can't give you a citation it's not like this is my dissertation
They can catch increasingly wider classes of programmer errors
this is still a fucking hateful way to talk about formal verification
and even guarantee the absence of certain types of bugs.
such a half-assed phd thesis. do you even believe in what you're saying
Proof creation has a high cost associated with it.
yeah you keep saying it sucks. aren't you supposed to fix things like that
To give an idea, it has been the author’s experience [93, 94, 96] that, in an interactive theorem prover, verifying the functional correctness of C code can require between one and two orders of magnitude more proof steps than line count,
(a) yeah maybe cause you keep trying to verify functional correctness guarantees about heap allocations in ring 0
(b) "line count" appeals to the VC class. it doesn't work on programmers
literally everyone go read aaron turon's paper on weak atomic memory orderings right now https://plv.mpi-sws.org/gps/paper.pdf yes there's a coq proof but that's not what a paper is for!!!
check out this future work section:
However, the C11 model allows programmers to freely mix memory orderings, and ideally program logics should support such mixed reasoning as well.
literally it's that easy to give a shit about making people safe and providing powerful robust guarantees. this is why rust used to be good
Early investigation suggests that the C11 model has some corner cases when mixing memory orderings that may obstruct compositional reasoning principles.
i get nerd sniped every time i read that line lmao