Post · bonfire.cafe

If you need an objective argument for why #LLM agents are unsuitable for deployment in enterprise settings, use what @Mer__edith calls "The Exponential Decay of Success" 📉

https://media.ccc.de/v/39c3-ai-agent-ai-spy#t=1629

You can't argue against math/physics!

I also highly recommend watching the whole talk, in which Meredith Whittaker and Udbhav Tiwari describe the increasing erosion of end-to-end encryption and #privacy by #OS-level AI agents.

#39C3 #E2EE #Society #Microsoft #Ethics #AI #AIAgent

A slide from the talk shows the following example calculation of the success rate of AI agents on a task that consists of multiple steps:
Given a generous 95% per-step accuracy, a 10-step task has only a ~60% success rate, and a 30-step task only ~21%.

The formula for calculation is: S(n) = r^n, where
- S(n) is the probability of success after n steps
- r is the reliability per step
- n is the integer number of steps (1, 2, 3, ...)
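
To check the slide's numbers yourself, here is a minimal Python sketch of S(n) = r^n. It assumes every step succeeds independently with the same reliability r and that the task only succeeds if all n steps do; the slide text above does not state those conditions explicitly. The function names `success_probability` and `required_reliability` are made up for this illustration.

```python
def success_probability(r: float, n: int) -> float:
    """S(n) = r^n: chance that all n steps succeed.

    Assumes independent steps with identical per-step reliability r.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must be a probability between 0 and 1")
    if n < 1:
        raise ValueError("n must be a positive integer")
    return r ** n


def required_reliability(target: float, n: int) -> float:
    """Invert the formula: the per-step r needed so that r^n equals target."""
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    if n < 1:
        raise ValueError("n must be a positive integer")
    return target ** (1.0 / n)


if __name__ == "__main__":
    # The two cases from the slide: 95% per step, 10 and 30 steps.
    for n in (10, 30):
        print(f"{n} steps at 95% per step: {success_probability(0.95, n):.1%}")

    # How reliable would each step have to be for a 30-step task
    # to succeed 95% of the time overall?
    print(f"needed per step: {required_reliability(0.95, 30):.4f}")
```

The inverse function is the more sobering half: it shows how close to perfect each individual step has to be before a long task becomes dependable.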

gittaca

@gittaca@chaos.social replied · last month

@janriemer @Mer__edith Same in multi-step chemical syntheses. People are astonished when you tell them that a yield of 80 or 90% _per step_ is practically useless for many endeavours.
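
The synthesis case is the same arithmetic, except that each step can have its own yield, so the overall yield is the product of the per-step yields. A small sketch, assuming a linear route with no losses other than the stated yields; `overall_yield` is a hypothetical helper name and the 10-step route is an illustrative choice, not a figure from the reply.

```python
import math


def overall_yield(step_yields: list[float]) -> float:
    """Overall yield of a linear multi-step route: the product of all step yields."""
    if any(not 0.0 <= y <= 1.0 for y in step_yields):
        raise ValueError("each yield must be a fraction between 0 and 1")
    return math.prod(step_yields)


if __name__ == "__main__":
    # Illustrative 10-step routes at a uniform 80% and 90% yield per step.
    for per_step in (0.80, 0.90):
        total = overall_yield([per_step] * 10)
        print(f"10 steps at {per_step:.0%} each: {total:.1%} overall")
```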
