mhoye
@mhoye@cosocial.ca · 11 hours ago

Being on Team Words Mean Things is difficult these days, particularly when multibillion-dollar companies put out breathless press releases saying "By using our massive language model, whose training data includes every version of GCC ever released, and having it autocorrect its own output by testing it against GCC, we managed to make a C compiler that mostly works for only $20,000 in a week and gosh I have so many feelings."

I mean, what the fuck are we even doing here.

https://www.anthropic.com/engineering/building-c-compiler

The fix was to use GCC as an online known-good compiler oracle to compare against. I wrote a new test harness that randomly compiled most of the kernel using GCC, and only the remaining files with Claude's C Compiler. If the kernel worked, then the problem wasn’t in Claude’s subset of the files. If it broke, then it could further refine by re-compiling some of these files with GCC. This let each agent work in parallel, fixing different bugs in different files, until Claude's compiler could eventually compile all files. (After this worked, it was still necessary to apply delta debugging techniques to find pairs of files that failed together but worked independently.)
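The technique that excerpt describes is essentially a bisection against a trusted oracle: compile most files with GCC, hand a slice to the compiler under test, and let a passing or failing build narrow down where the miscompile lives. A rough sketch of that shape in Python follows; this is not the article's code, and GOOD_CC, TEST_CC, SOURCES, and build_and_test() are all placeholders, not anything from the real kernel build.

# Hypothetical sketch of an oracle-style bisection harness, in the spirit
# of the excerpt above.  Every name here is a stand-in.

import subprocess
from pathlib import Path

GOOD_CC = "gcc"           # known-good oracle compiler
TEST_CC = "./test-cc"     # hypothetical compiler under test
SOURCES: list[Path] = []  # would hold the kernel's .c files

def compile_one(cc: str, src: Path) -> None:
    """Compile a single translation unit with the given compiler."""
    subprocess.run([cc, "-c", str(src), "-o", str(src.with_suffix(".o"))],
                   check=True)

def build_and_test(assignment: dict[Path, str]) -> bool:
    """Placeholder: compile every file with its assigned compiler, link the
    result, run the workload, and report whether it behaved correctly."""
    for src, cc in assignment.items():
        compile_one(cc, src)
    # ...link the objects and boot the kernel or run tests here...
    return True

def narrow_blame(suspects: list[Path]) -> Path:
    """Bisect toward a single file that only misbehaves under TEST_CC,
    assuming exactly one bad file.  (Finding *pairs* of files that only fail
    together, as the excerpt mentions, needs a delta-debugging pass on top.)"""
    suspects = list(suspects)
    while len(suspects) > 1:
        half = suspects[: len(suspects) // 2]
        # Compile this half with the compiler under test, everything else
        # with the known-good oracle.
        assignment = {s: (TEST_CC if s in half else GOOD_CC) for s in SOURCES}
        if build_and_test(assignment):
            # The build works, so the bug is not in this half.
            suspects = [s for s in suspects if s not in half]
        else:
            suspects = half
    return suspects[0]

The point of doing it this way is that each parallel agent can be handed a different suspect slice and work on a different bug at the same time, which matches how the article describes the fix.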

Building a C compiler with a team of parallel Claudes

Jenniferplusplus
@jenniferplusplus@hachyderm.io replied · 6 hours ago

@mhoye actually, I think it's worse than that? They didn't even build gcc. They built a Linux kernel compiler. Running it on any other source would continue to reveal ways in which it is unlike gcc.

Federico Mena Quintero
@federicomena@mstdn.mx replied · 10 hours ago

@mhoye I'm re-reading "Do Artifacts Have Politics?" and nodding so hard that my neck is starting to hurt.

"At issue is the claim that the machines, structures, and systems of modern material culture can be accurately judged not only for their contributions of efficiency and productivity, not merely for their positive and negative environmental side effects, but also for the ways in which they can embody specific forms of power and authority."

That's only the opening paragraph.

mhoye
@mhoye@cosocial.ca replied · 11 hours ago

Great news everyone: thanks to significant advances in modern algorithmic analysis, I am personally able to outperform a warehouse full of specialized GPUs by five orders of magnitude with a single ARM core, for one one-millionth the cost, in 0.1% of the time, by training the "cp" command on only the GCC source and then compiling the output of that program with GCC.

The resulting compiler - which I'm calling "mhoyecc", or as I've taken to calling it, mhoye plus cc - passes 100% of GCC's tests.

Gabriele Svelto
@gabrielesvelto@mas.to replied · 11 hours ago

@mhoye bonus points if you pipe stuff into sed 's/gcc/mhoyecc/g' for just a small amount of extra power

mhoye
@mhoye@cosocial.ca replied · 11 hours ago

@gabrielesvelto Whoa man, that's crazy talk. I can't just pipe a codebase through _sed_ and pretend it's mine, I'm not that waterfox guy.

