Discussion
Mike Williamson
@sleepycat@infosec.exchange  ·  2 weeks ago

Meta: practical #AIagent #security and the Agents Rule of Two

https://ai.meta.com/blog/practical-ai-agent-security/

Jan :rust: :ferris:
@janriemer@floss.social  ·  2 weeks ago

If you need an objective argument for why #LLM agents are unsuitable for enterprise deployment, use what @Mer__edith calls "The Exponential Decay of Success" 📉

https://media.ccc.de/v/39c3-ai-agent-ai-spy#t=1629

You can't argue against math/physics!

I also highly recommend watching the whole talk, in which Meredith Whittaker and Udbhav Tiwari present the growing erosion of end-to-end encryption and #privacy by #OS-level AI agents.

#39C3 #E2EE #Society #Microsoft #Ethics #AI #AIAgent

A slide from the talk showing an example calculation of the success rate of AI agents on a task that consists of multiple steps:
Given a generous 95% per-step accuracy, a 10-step task has only a ~60% success rate, and a 30-step task only ~21%.

The formula is: S(n) = r^n, where
- S(n) is the probability of success after n steps
- r is the reliability per step
- n is the integer number of steps (1, 2, 3, ...)
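
For anyone who wants to check the slide's numbers, here is a minimal Python sketch of S(n) = r^n. It assumes every step succeeds independently with the same reliability r, which is what the formula implies. The function names `success_probability` and `max_steps` are made up for illustration and do not come from the talk.

```python
def success_probability(r: float, n: int) -> float:
    """S(n) = r**n: chance that all n steps succeed.

    Assumes independent steps with identical per-step reliability r.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be a positive integer")
    return r ** n


def max_steps(r: float, target: float) -> int:
    """Largest n for which r**n is still at least `target`.

    Hypothetical helper, not from the slide: it answers how long a task
    can get before the overall success rate drops below a chosen threshold.
    """
    if not 0.0 < r < 1.0:
        raise ValueError("r must be strictly between 0 and 1")
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in (0, 1]")
    n = 0
    # Keep adding steps while one more step still meets the target.
    while r ** (n + 1) >= target:
        n += 1
    return n


if __name__ == "__main__":
    for steps in (10, 30):
        print(f"{steps} steps at 95%: {success_probability(0.95, steps):.1%}")
    print("steps before dropping below 50%:", max_steps(0.95, 0.5))
```

With r = 0.95 this reproduces the slide's roughly 60% for 10 steps and 21% for 30 steps. If steps are not independent, or the agent can retry or recover from a failed step, the real success rate will differ from this simple model.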
AI Agent, AI Spy
maco and 1 other boosted
jbz
@jbz@indieweb.social  ·  last month

👍 Google's AI Deletes User's Entire Hard Drive, Issues Groveling Apology: "I Cannot Express How Sorry I Am"

https://futurism.com/artificial-intelligence/google-ai-deletes-entire-drive

#ai #aiagent #antigravity

Roni Rolle Laukkarinen
@rolle@mementomori.social  ·  6 months ago

Left: Qwen Code
Right: Claude Code

Open source agents are really improving. Qwen Code 0.0.1-alpha.10 on GitHub: https://github.com/QwenLM/qwen-code

#AI #AIAgent #Claude #ClaudeCode #Qwen #QwenCode #QwenCoder #AlibabaCloud #OpenSource

Screenshot of two AI coding agents in a Linux terminal, Qwen Code on the left and Claude Code on the right, both solving the same problem with the same solution

bonfire.cafe

A space for Bonfire maintainers and contributors to communicate
