As a research project, I built a needed tool with Claude Code. I thought it would be a disaster, but it wasn't. I have some complicated feelings about it.
Wow, thanks for taking the time to write out your experience so completely. I think I’d have a similar complex reaction.
I recently developed a large multithreaded Python program implementing a complex PID-controlled system, with lots of realtime I/O from sensors and encoders and out to actuators, along with weak test coverage. Kids interacted with it at a science museum.
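For readers unfamiliar with the term, the control loop described above might look roughly like this toy sketch. Everything here is hypothetical: the gains, the setpoint, and the simulated "plant" stand in for the real sensors and actuators the exhibit would use.

```python
class PID:
    """Minimal PID controller: output = Kp*e + Ki*integral(e) + Kd*de/dt."""

    def __init__(self, kp, ki, kd, setpoint):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = None

    def update(self, measurement, dt):
        error = self.setpoint - measurement
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Toy closed loop: the "plant" is simulated, nudged by the actuator output.
pid = PID(kp=0.6, ki=0.1, kd=0.05, setpoint=100.0)
value = 20.0  # initial sensor reading
for _ in range(50):
    correction = pid.update(value, dt=0.1)
    value += correction * 0.1  # apply actuator effect to the simulated plant
```

In a real exhibit, `measurement` would come from encoder/sensor reads on their own threads, which is where the multithreading (and the testing difficulty) comes in.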
1/n
@mttaggart Great write-up. I agree that there's a lot of nuance. Those using AI solely as a means of dividing the population into "ethical" and "unethical" groups, while declaring "AI is a bubble," are not going to create the change they want to see. Balanced takes that insist on using these tools as safely (and, ideally, as ethically) as possible are how you cross the divide.
@mttaggart Great. Sounds so familiar. I just tried repeating a git process I knew worked, because I did it yesterday with Claude's help. Today Claude was completely off target, and I had to correct it on major things many times. I was a bit shocked, but... it shows that one can learn from using it. How else could I have corrected it the second time?
Thanks for this post. I've been an extreme skeptic of LLMs, but I'm seeing increasingly promising results for agentic coding. I'm not sure I'm on board with using it as regular practice, but I increasingly see the need to experiment with it, both to better understand how it'll impact my job and to offer better-informed opinions on it.
@mttaggart I have read only the self-flagellation so far and can I just say: oof.
my own co-skeptic feeling here is that I am deeply sympathetic to what you're trying to do here, and also I am furious with your employer (or maybe just the ecosystem more generally) for effectively forcing you to take a bunch of risks with this
@glyph I guess I see the professional side of it this way. I could:
- Quit, which harms everyone involved and solves nothing.
- Say nothing, which harms anyone impacted by dangerous AI.
- Do what I'm doing, and hope to mitigate harm.
The choice is clear, and I'd much rather that I be the one talking about AI security than a myopic booster of the tech.
@mttaggart oh yeah, for sure. and even given risks+externalities accounted for, this type of work (i.e. the investigation in the post itself) needs to get done. and it's not worth much if it doesn't get done by someone with your priors and methodological constraints, which is to say, someone who it will personally hurt. so, (unironically) thank you for your service here
@mttaggart I am still left wondering, per https://blog.glyph.im/2025/08/futzing-fraction.html , if overall you felt like your experience here mitigated my ongoing concern that despite "appearing to work" on small-scale tools like this, the larger risks still mean that it may be a net negative, even just straightforwardly to productivity, when deployed at scale
@glyph I hope I was clear that I still find the technology's harms outweigh its benefits. That would be true even if it produced perfect code every time, and that simply isn't the case.
What I discovered here is that, in limited use cases, the probability of error can decrease significantly, and the actual time investment to build a working, secure product shrinks. That said, a lot of things need to go right, and every single process for keeping the model on track is prone to failure. Also, context (in the model's sense) really matters. This project was small enough that the requisite context was almost always available to the model, or it was primed with external sources to make it available. Deployed against a much larger codebase, you'd need proportionally more computing resources to do likewise, and again your probability of error increases.
So yeah, still not great. I found a way to make it work, but doing so sucked ass.
I also wasn't kidding about Rust as basically a requirement. I would never in a million years attempt this with Python—which I love, by the way. But even with live LSP linting, the average Python code quality in the model's training corpora is going to affect output, and without the compile-time checks of Rust, I'd be very worried about hidden dragons.
@glyph Oh, one other point. I think the FF model might need a corollary for coding agents. Per-inference calculations don't really make sense in this workflow. Instead it would be more beneficial to think about time/usage per feature or commit or something. And yeah, by those metrics, this was phenomenally faster than what I would have done myself, and thanks to careful scaffolding, solid on the other concerns as well. By the numbers, this application was an unequivocal win. Just, y'know, an icky one.
@mttaggart okay, read the whole thing now. I wouldn't have phrased the "purity" section at the end in quite the same way you did, but it didn't raise my hackles in quite the same way Doctorow did with the same point. "I am tired of running from one corner of technology to the next" resonated hard enough to rattle my teeth
@glyph I struggled with that section a lot, but I think it's demonstrably true that we spend more time tearing each other down than building each other up, and in so doing we give the victory to our adversaries.
@mttaggart my criteria for using llms for code generation at work:
1. Internal only tool
2. Doesn't involve new ideas, just involves implementing well known design patterns
3. Doesn't directly affect anything critical
4. I could do it, and have a detailed idea of how I would implement it
5. I have a good understanding of the necessary tests and edge cases that would verify the generated code
6. I don't have the time available to set aside for implementing it in the next 6 months
@mttaggart something I have been thinking recently, and which chimes a bit with your ultimate conclusion, is that I think of AI users a lot like smokers. 1/2
@mttaggart E.g. a) I think it is generally bad for their health (smoking literally, AI in terms of cognitive skills) in the long term, although some will get away with it. b) Lots of people using them will be collectively bad for society as these costs compound. c) An individual using it doesn't make them a bad person (although I would encourage them not to). d) Pushing it (be that tobacco or AI), on the other hand, does demonstrate some sort of moral failing. 2/2
@smilingdemon That feels mostly correct, and the addictive properties align as well. I am wary of too-simple parallels, but this is close to a line of thinking I'm pursuing.
@mttaggart This fits what I’ve seen at $dayjob recently where talented and experienced people manage to sometimes get good use out of these tools (although with fewer ethical doubts than you describe). I’m mostly worried about problems caused by folks who don’t care or don’t know any better.
Successful use cases will also make it more difficult to argue against LLM use for those of us who don't want to use these tools for ethical reasons. I'm not looking forward to that.
@zaicurity Exactly. On carelessness: I don't think a tool absolves anyone of it, but I do think this tool in particular—at least as it is implemented now—makes carelessness not only easy, but highly incentivized. Without a dizzying array of external guardrails, harmful mistakes will occur. A bit more friction in the creation process might go a long way. But alas, that would not be a popular product.
And yeah, people should have a right to opt out of using these things for ethical reasons, but I do think examining those objections closely is worthwhile, if only to strengthen them.
@mttaggart ah. That’s what the vaguetoot was about.
@winterknight1337 Yep. Bracing for the fallout.
@mttaggart good write up man.
@winterknight1337 Thanks, friend. Most appreciated.
@mttaggart Nice post. Yeah, the tipping point from these coding assistants creating slop to being usable is a fairly recent thing. I'm not a coder, I'm a security engineer, so I'm used to handing over trust to a tool or SaaS service. Guardrails and layered controls are the key.
I think the skills we're learning right now, in getting coding assistants to write good code, are marketable. I feel like we're back in the early days of the cloud, learning a cutting-edge new skill.
Anywho, just wanted to say I enjoyed reading your blog post. I too am struggling with all the complexities and externalities of AI.
@Xavier Thank you for reading, and for struggling!