Discussion

David Gerard
@davidgerard@circumstances.run · 11 hours ago

The key weakness in AI agents is that they're a lie. They don't work. They just don't fuckin' work. You can't set a hallucination engine to work doing tasks. It's pants on head stupid. The hype pretends this isn't the case and hypothesises a fabulous future where they work *at all*. This is a lie.

A useful model for "AI agents" is that they're the current excuse meme for AI. They're not a thing that works at all, now or in the fabulous future. But they're *such* good material for hypecrafting. No sausage at all, but *my god* that sizzle.

Marius (windsheep) :donor:
@windsheep@infosec.exchange replied · 6 hours ago

@davidgerard There are things AI cannot do today, but many simple software tasks can be automated.
OpenAI, however, doesn’t work. We know that.

nonlinear
@nonlinear@social.praxis.nyc replied · 6 hours ago

@davidgerard we live in "any day now" news. it's the news for future news, hopefully, invest in us.

Baltergeist
@Cotopaxi@mstdn.social replied · 6 hours ago

@davidgerard
"Pants on head stupid" is my hero descriptive of the month. 🫶

Todd Knarr
@tknarr@mstdn.social replied · 7 hours ago

@davidgerard The problem is they sort-of work, just well enough and just often enough to convince people they'll work all the time. Then the moment you trust them, they go "ebola-contaminated diarrhea in pants on head"-stupid. By then it's too late, backing out isn't an option, so everyone who proposed them has to pretend this wasn't normal so they don't look like complete idiots.

Cogito ergo mecagoendios
@elrohir@mastodon.gal replied · 7 hours ago

@davidgerard They do this thing where they cite a percentage of success, such as "we got 60% of SAT questions right". Implicitly they are tricking your mind into thinking there is some sort of progress bar slowly loading. That the wrinkles are about to be ironed out soon™.

They are not.

Mark Harbinger
@Mark_Harbinger@mastodon.social replied · 7 hours ago

@davidgerard

The sound you hear won't be champagne corks. So...what do you plan to wear for the next, worldwide #GreatDePrAIssion ?

atlovato
@atlovato@mastodon.social replied · 8 hours ago

@davidgerard - I'd love to see AI crash.

Jer
@Jer@chirp.enworld.org replied · 8 hours ago

@davidgerard The biggest asset that AI agents have is that they're a great excuse for why companies are laying off thousands of workers. It's not mismanagement or overhiring or a maniac in charge tanking the US economy - it's AI that can do your job better than you and faster than you! It's your fault that you aren't as good as the AI agent that replaces you!

Turns an economic crisis into a morality play to blame the workers. Perfect for the guys in charge right now who are bad at their jobs.

Rodrigo Dias
@rgo@masto.pt replied · 8 hours ago

@davidgerard Tried chaining LLMs for tasks—ends up in loops or wrong outputs. Better for ideation than execution.
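That looping failure mode is easy to reproduce: a chained pipeline has nothing forcing progress, so without a guard it will happily cycle forever. A minimal sketch, where `step` is a hypothetical stand-in for one LLM call (state in, state out), not any real API:

```python
def run_chain(step, state, max_steps=10):
    """Run a chain of calls, bailing out if it loops or exhausts its budget.

    `step` is a hypothetical stand-in for one LLM call (state -> state);
    nothing in the chain itself guarantees forward progress.
    """
    seen = {state}
    for _ in range(max_steps):
        state = step(state)
        if state in seen:  # the chain is revisiting an earlier state
            return state, "loop detected"
        seen.add(state)
    return state, "budget exhausted"

# A toy step that gets stuck bouncing between two states:
flip = {"draft": "review", "review": "draft"}
print(run_chain(lambda s: flip[s], "draft"))  # → ('draft', 'loop detected')
```

The loop guard and the step budget are doing all the safety work here; the "agent" part contributes nothing that prevents the cycle.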

Peter
@peter@thepit.social replied · 8 hours ago

@davidgerard using agents to solve "hallucinations" is like if you have a pot of soup with a turd in it, so your solution is to add more broth. your chances of getting the turd go down the more broth (iterations) you add, success!! (also, the soup is now the size of a swimming pool and costs $4,000.)

Revenant
@Revenant@hear-me.social replied · 8 hours ago

@davidgerard well, yeah.

Dibs
@dtwx@mastodon.social replied · 9 hours ago

@davidgerard I have multiple examples of Copilot failing to provide accurate information about MS products, which, apparently, Copilot can configure FOR you!

How's that possible?

gigantos
@gigantos@social.linux.pizza replied · 9 hours ago

@davidgerard I don’t know what you base this on. It most definitely works for some things, at least some of the time.

For example, someone I know needed to do some stuff with an Arduino to make it show a pretty wave pattern using unevenly distributed LED lights. This person was a crafter, not a coder. However, by uploading a hand-drawn picture of where the LEDs were placed, it generated a web-based simulator with sliders to tweak parameters. Then, when he was happy with the result after tuning the sliders, it generated code for the Arduino that compiled and ran perfectly. First try.

It might have been an incredible amount of luck, but this non-technical person got his art project to work without needing to learn anything about code.

Avner
@Avner@anticapitalist.party replied · 8 hours ago

@gigantos @davidgerard "works for some things, at least some of the times" is NOT the way these LLM tools are being pitched. I think there would be much less of a backlash if OpenAI and co were like "hey, here's an occasionally useful tool for generating text and here are the use cases it's actually good at," rather than "fire all your employees and replace them with AI, who cares if it's fit for purpose!"

DistroWatch
@distrowatch@mastodon.social replied · 9 hours ago

@davidgerard "You can't set a hallucination engine to work doing tasks."

You can if your goal is to produce a lot of material that is not correct, or it doesn't matter if the material is correct.

I think that is what people tend to miss about the drive to get AI into the world. The people pushing it don't care if it's accurate, it might even be better for them if it's not, they just want a lot of material that looks passable to some people. They want filler and propaganda and misinformation.

DistroWatch
@distrowatch@mastodon.social replied · 9 hours ago

@davidgerard So when you say "they don't work", keep in mind that AI _does_ work as intended. AI agents just aren't very useful for most people. Those statements aren't contradictory; together they're a sign of whom AI actually works for.

mossman
@mossman@social.vivaldi.net replied · 10 hours ago

@davidgerard that reminds me... I have to do my company's mandatory agentic AI training course this week... wish me luck 🤮

Skjeggtroll
@skjeggtroll@mastodon.online replied · 10 hours ago

@davidgerard

Software Agents never made much sense. In order for people to trust them to act on their behalf, the tasks they do have to be so well defined that for all practical purposes it'd be better to just automate them.

"AI Agents" make even less sense. Has anyone even suggested one that's more than just an automation wrapper around a sequence of LLM calls and service APIs?

David Chisnall (*Now with 50% more sarcasm!*)
@david_chisnall@infosec.exchange replied · 11 hours ago

@davidgerard

The use case for 'agents' isn't that they do useful things unattended, it's that they can consume (billed for) tokens unattended.

Artemis
@art_codesmith@toot.cafe replied · 11 hours ago

@davidgerard Maybe I'm misunderstanding something, but from what I understand, it's basically the same LLM stuff but in the background?
Basically, if you roll the dice enough times, you might get something that passes all the unit tests?
(And burn a whole bunch of tokens in the process...)
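That "roll the dice until the tests pass" loop can be written down directly. A minimal sketch, where `generate_candidate` is a hypothetical stand-in for an LLM call and each attempt stands in for another batch of burned tokens:

```python
import random

def generate_candidate():
    # Hypothetical stand-in for an LLM call: one random guess per invocation.
    return random.randint(0, 9)

def passes_tests(candidate):
    # Stand-in unit-test suite: exactly one output counts as "correct".
    return candidate == 7

def roll_until_green(max_attempts=1000):
    """Resample until the tests pass; the attempt count is the token bill."""
    for attempt in range(1, max_attempts + 1):
        candidate = generate_candidate()
        if passes_tests(candidate):
            return candidate, attempt
    return None, max_attempts
```

With a 1-in-10 chance per roll, the loop averages ten attempts before the tests go green; the unit tests do all the filtering, and every failed roll still costs tokens.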

David Gerard
@davidgerard@circumstances.run replied · 10 hours ago

@art_codesmith you say "do a thing" and it goes and does the thing! Or what it hallucinates as the thing. This turns out to have a disastrously high failure rate. Also, it's hilariously easy to prompt-inject.

Welbog
@welbog@mstdn.ca replied · 8 hours ago

@davidgerard @art_codesmith It seems to me that we've built a system that has a chance of getting the right answer, but we've given up on finding ways to improve the chances of it getting the right answer. Instead we've wrapped the system in a loop, to check if it has the right answer after each iteration.

It's Bogosort.

The greatest achievement of humanity, worth boiling the oceans for, is Bogosort.
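For reference, Bogosort is exactly that structure: no directed improvement between iterations, just "generate a candidate, check it, repeat". A minimal sketch:

```python
import random

def bogosort(items):
    """Shuffle at random until the list happens to be sorted.

    There is no progress between iterations: each shuffle is independent
    of the last, so expected running time grows factorially with input size.
    """
    items = list(items)
    while any(a > b for a, b in zip(items, items[1:])):
        random.shuffle(items)
    return items
```

`bogosort([3, 1, 2])` eventually returns `[1, 2, 3]`; the check-and-retry wrapper does all the work the "algorithm" itself refuses to do.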

Jonathan Hendry
@jonhendry@iosdev.space replied · 11 hours ago

@davidgerard

People who were losing patience are like “ah, agents, now we’ll surely get what we were promised!” And then it takes a few months or a year for them to figure out, nope it still doesn’t work. By which time the AI grifters will have another silver bullet to pitch.

Androcat
@androcat@toot.cat replied · 11 hours ago

@davidgerard

But say this to the believers and they respond "It simply isn't credible to criticize this technology without acknowledging that it is useful to many people"...

As if they get to demand that we believe the lie when we criticize the lie.

The only thing we gotta acknowledge is that many people are utter tools who want to be lied to.

bonfire.cafe · A space for Bonfire maintainers and contributors to communicate · Bonfire social 1.0.0