Discussion
Jan :rust: :ferris:
@janriemer@floss.social  ·  activity timestamp 2 weeks ago

In case you need objective arguments on why #LLM agents are unsuitable for deployment in enterprise settings, your argument should be what @Mer__edith calls "The Exponential Decay of Success" 📉

https://media.ccc.de/v/39c3-ai-agent-ai-spy#t=1629

You can't argue against math/physics!

Also highly recommend watching the whole talk, where Meredith Whittaker and Udbhav Tiwari present the increasing erosion of End-2-End encryption and #privacy via #OS-level AI agents.

#39C3 #E2EE #Society #Microsoft #Ethics #AI #AIAgent

A slide from the talk showing an example calculation of the success rate of AI agents on a task that consists of multiple steps:
Given a generous 95% per-step accuracy, a 10-step task has only a ~60% success rate, and a 30-step task only ~21%.

The formula is S(n) = r^n, where
- S(n) is the probability of success after n steps
- r is the reliability per step
- n is the integer number of steps (1, 2, 3, ...)
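
The slide's formula can be checked with a few lines of Python. This is only a minimal sketch: the function name `success_probability` is made up for illustration, and it assumes that steps fail independently and that the task succeeds only if every step succeeds, which is what S(n) = r^n implies.

```python
def success_probability(r: float, n: int) -> float:
    """Return S(n) = r**n, the chance that all n steps succeed.

    Assumes each step succeeds independently with probability r.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must be a probability between 0 and 1")
    if n < 1:
        raise ValueError("n must be a positive integer")
    return r ** n


# Reproduce the slide's figures for a 95% per-step reliability.
for n in (1, 10, 30):
    print(f"{n:>2} steps: {success_probability(0.95, n):.1%}")
# prints:
#  1 steps: 95.0%
# 10 steps: 59.9%
# 30 steps: 21.5%
```

The values 59.9% and 21.5% match the slide's rounded ~60% and ~21%.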
AI Agent, AI Spy
gittaca
@gittaca@chaos.social replied  ·  activity timestamp 2 weeks ago

@janriemer @Mer__edith The same holds for multi-step chemical syntheses. People are astonished when you tell them that a yield of 80 or 90% _per step_ is practically useless for many endeavours.
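
To put rough numbers on the synthesis analogy, here is a small sketch. The helper `overall_yield` and the choice of a 10-step route are illustrative assumptions, not figures from the reply; it simply multiplies the per-step yields together, which is the same compounding as S(n) = r^n.

```python
import math


def overall_yield(step_yields: list[float]) -> float:
    """Overall yield of a linear synthesis: the product of per-step yields."""
    return math.prod(step_yields)


# Hypothetical 10-step route with the same yield at every step.
for per_step in (0.80, 0.90):
    total = overall_yield([per_step] * 10)
    print(f"{per_step:.0%} per step over 10 steps: {total:.1%} overall")
# prints:
# 80% per step over 10 steps: 10.7% overall
# 90% per step over 10 steps: 34.9% overall
```

Because the function takes a list, it also handles routes where each step has a different yield.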
