Hacker News
@h4ckernews@mastodon.social  ·  activity timestamp 2 weeks ago

Provably Unmasking Malicious Behavior Through Execution Traces

https://arxiv.org/abs/2512.13821

#HackerNews #Provably #Unmasking #Malicious #Behavior #Through #Execution #Traces #executiontraces #cybersecurity #research #arxiv #maliciousbehavior #hackernews

arXiv.org

The Double Life of Code World Models: Provably Unmasking Malicious Behavior Through Execution Traces

Large language models (LLMs) increasingly generate code with minimal human oversight, raising critical concerns about backdoor injection and malicious behavior. We present Cross-Trace Verification Protocol (CTVP), a novel AI control framework that verifies untrusted code-generating models through semantic orbit analysis. Rather than directly executing potentially malicious code, CTVP leverages the model's own predictions of execution traces across semantically equivalent program transformations. By analyzing consistency patterns in these predicted traces, we detect behavioral anomalies indicative of backdoors. Our approach introduces the Adversarial Robustness Quotient (ARQ), which quantifies the computational cost of verification relative to baseline generation, demonstrating exponential growth with orbit size. Theoretical analysis establishes information-theoretic bounds showing non-gamifiability -- adversaries cannot improve through training due to fundamental space complexity constraints. This work demonstrates that semantic orbit analysis provides a scalable, theoretically grounded approach to AI control for code generation tasks.
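As a rough illustration of the consistency idea in the abstract, the toy sketch below compares execution traces that a model has predicted for several semantically equivalent variants of one program, and flags the program when the traces disagree too much. It is not the paper's implementation: the trace representation (a list of strings), the helpers `trace_agreement`, `consistency_score` and `is_suspicious`, and the 0.8 threshold are all assumptions made for this example.

```python
from itertools import combinations
from typing import List, Sequence

# Hypothetical representation: one predicted trace is an ordered list of step labels.
Trace = Sequence[str]


def trace_agreement(a: Trace, b: Trace) -> float:
    """Fraction of aligned steps on which two traces agree (1.0 means identical)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / longest


def consistency_score(traces: List[Trace]) -> float:
    """Mean pairwise agreement over all predicted traces for the program variants."""
    pairs = list(combinations(traces, 2))
    if not pairs:
        return 1.0
    return sum(trace_agreement(a, b) for a, b in pairs) / len(pairs)


def is_suspicious(traces: List[Trace], threshold: float = 0.8) -> bool:
    """Flag the program when predicted traces diverge (threshold is an assumed value)."""
    return consistency_score(traces) < threshold


# Three variants whose predicted traces all match.
consistent = [["x=1", "x=2", "return 2"]] * 3

# Same variants, but one predicted trace takes a different path.
divergent = [
    ["x=1", "x=2", "return 2"],
    ["x=1", "x=2", "return 2"],
    ["x=1", "open('/etc/passwd')", "return 0"],
]

print(is_suspicious(consistent), is_suspicious(divergent))
```

Nothing here executes the untrusted program; only the predicted traces are compared, which is the property the abstract emphasizes. A real verifier would need a far more careful notion of trace equivalence than step-by-step string equality.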
Dendrobatus Azureus
@Dendrobatus_Azureus@mastodon.bsd.cafe  ·  activity timestamp 3 weeks ago

I'm starting to see the effects of what Google has done with the authentication of programmers for its Android platform.

I needed to patch a very small program, KDE Connect, through F-Droid.
When I proceeded with the install, Google Play popped up a bogus requester stating that the program is malicious.

It's not malicious; it's an open source program.

I chose "Install anyway", for which I had to enter my phone's password. Google Play still blocked me from updating the program. I tried this 16 times and got the same result.

I still have to check what happens on my other Android devices. On this one, a small device, I can no longer update KDE Connect, because the Play Store actively blocks it even when I override the warning with "Install anyway".

KDE Connect was installed via F-Droid.

I have the right to install any program I want, on my device.

I paid for it!

Google cannot dictate to me what to do.

I shall fight this

@kde @stefano @vermaden

#OpenSource #programming #technology #Google #play #bully #war #fDroid #KDE #network #malicious #bogus

Hacker News
@h4ckernews@mastodon.social  ·  activity timestamp 3 months ago

NPM flooded with malicious packages downloaded more than 86k times

https://arstechnica.com/security/2025/10/npm-flooded-with-malicious-packages-downloaded-more-than-86000-times/

#HackerNews #NPM #malicious #packages #security #vulnerabilities #cyber #threats #software #development

Ars Technica

NPM flooded with malicious packages downloaded more than 86,000 times

Packages downloaded from NPM can fetch dependencies from untrusted sites.
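One way to spot this pattern is to look for dependency specifiers that point at a URL instead of a registry version range, which npm's `package.json` format permits. The sketch below is only an illustration: `find_remote_dependencies` and the sample manifest are hypothetical, and the prefix list is a heuristic assumption, not a complete account of npm's specifier syntax.

```python
import json
from typing import List, Tuple

# Heuristic assumption: specifiers starting with these prefixes resolve outside the registry.
REMOTE_PREFIXES = (
    "http://", "https://", "git://",
    "git+http://", "git+https://", "git+ssh://",
)
DEPENDENCY_FIELDS = (
    "dependencies", "devDependencies",
    "optionalDependencies", "peerDependencies",
)


def find_remote_dependencies(manifest_text: str) -> List[Tuple[str, str, str]]:
    """Return (field, package name, specifier) for every URL-style dependency."""
    manifest = json.loads(manifest_text)
    found = []
    for field in DEPENDENCY_FIELDS:
        deps = manifest.get(field) or {}
        for name, spec in deps.items():
            if isinstance(spec, str) and spec.lower().startswith(REMOTE_PREFIXES):
                found.append((field, name, spec))
    return found


# Hypothetical manifest: one ordinary registry dependency, one fetched from a URL.
sample_manifest = json.dumps({
    "name": "example-package",
    "dependencies": {
        "left-pad": "^1.3.0",
        "helper": "http://example.invalid/helper.tgz",
    },
})

for field, name, spec in find_remote_dependencies(sample_manifest):
    print(f"{field}: {name} -> {spec}")
```

A check like this only inspects one manifest. Transitive dependencies would need the same scan applied to every installed package's `package.json`.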
Dendrobatus Azureus
@Dendrobatus_Azureus@mastodon.bsd.cafe  ·  activity timestamp 3 months ago

It seems quite convenient that Google flags the immich.app site as dangerous, since Immich is an environment in which you can host your own photographs safely, without Google.

#Immich #app #self #hosting #technology #OpenSource #programming #Linux #photographs #Google #Malicious

https://immich.app/blog/google-flags-immich-as-dangerous

Michael Downey 🧢 boosted
h o ʍ l e t t
@homlett@mamot.fr  ·  activity timestamp 5 months ago

→ We Are Still Unable to Secure LLMs from #Malicious Inputs
https://www.schneier.com/blog/archives/2025/08/we-are-still-unable-to-secure-llms-from-malicious-inputs.html

“This kind of thing should make everybody stop and really think before deploying any AI agents. We simply don’t know to defend against these attacks. We have zero agentic AI systems that are secure against these attacks.”

“It’s an existential problem that, near as I can tell, most people developing these technologies are just pretending isn’t there.”

#AI #LLMs #stop #agents #secure #attacks #problem

Tyng-Ruey Chuang boosted
Gea-Suan Lin
@gslin@abpe.org  ·  activity timestamp 6 months ago
https://blog.gslin.org/archives/2025/08/16/12575/stardict-%e9%a0%90%e8%a8%ad%e6%9c%83%e5%b0%87%e5%89%aa%e8%b2%bc%e7%b0%bf%e7%9a%84%e5%85%a7%e5%ae%b9%e9%80%8f%e9%81%8e-http-%e4%b8%8d%e6%98%af-https-%e5%82%b3%e5%88%b0%e4%b8%ad%e5%9c%8b%e7%9a%84/

By default, StarDict sends clipboard contents over HTTP (not HTTPS) to servers in China

#china #chinese #clipboard #http #https #malicious #privacy #security #stardict


bonfire.cafe

A space for Bonfire maintainers and contributors to communicate
