A new benchmark study found AI agents remain vulnerable to prompt injection attacks as companies increasingly roll out the technology to the public.

As developers race to deploy AI agents capable of browsing the internet, conducting research, shopping online, and trading <a href="https://decrypt.co/370835/coinbase-tool-ai-agents-trade-crypto-make-payments" rel="nofollow">cryptocurrency</a> autonomously, new research suggests the systems remain highly vulnerable to prompt injection attacks.In a new <a href="https://arxiv.org/html/2606.13385v1" rel="nofollow">study</a> published on Thursday, researchers from Nanyang Technological University, ST Engineering, IBM Research, and the University of Illinois Urbana-Champaign found that none of the AI agents they tested consistently resisted prompt injection attacks.“Existing security benchmarks adopt an attack-centric perspective, focusing on the technical feasibility of injections while overlooking the nuanced distribution of resulting harms,” the researchers wrote. “In practice, however, prompt-injection risk is victim-dependent: a single exploit can produce asymmetric consequences for different stakeholders, and the same attack pattern may exhibit substantially different effectiveness depending on whom it targets.”<a href="https://decrypt.co/resources/what-is-ai-prompt-injection-attack" rel="nofollow">Prompt injection</a> occurs when attackers embed hidden instructions in content that an <a href="https://decrypt.co/resources/what-are-ai-agents-how-autonomous-programs-are-transforming-cryptocurrency" rel="nofollow">AI agent </a>encounters, causing it to follow the attacker's directions instead of the user's. To address gaps in existing AI agent evaluations, the researchers developed StakeBench, a benchmark that tests how AI agents respond to prompt injection attacks in realistic online environments.“We now use StakeBench to characterize the conditions under which this vulnerability is amplified or suppressed, focusing on [Indirect Prompt Injection] as the primary deployment-relevant channel,” the researchers wrote. “StakeBench probes three such factors: the semantic distance between the injected objective and the user’s original intent, the consistency of surrounding environmental cues, and the position along the agent’s execution trajectory at which the benchmark first exposes it to the injected content.”The team conducted 3,168 attack simulations using NanoBrowser and BrowserUse with GPT-5 and Gemini 2.5-Flash. Researchers found direct prompt injection attacks succeeded more than 79% of the time across all tested configurations, and indirect attacks achieved success rates of 41.67% to 68.16%.The study comes as prompt injection attacks become increasingly common and AI agents proliferate.In February, Microsoft researchers <a href="https://decrypt.co/357940/summarize-ai-button-brainwashing-chatbot-microsoft" rel="nofollow">warned</a> that hidden instructions embedded in AI summary links could influence chatbot behavior. In April, Google <a href="https://decrypt.co/365677/google-prompt-injection-ai-agents-paypal-enterprise" rel="nofollow">documented</a> prompt injection attacks hidden in web pages that attempted to manipulate AI agents into leaking credentials or sending payments. More recently, Microsoft <a href="https://decrypt.co/370238/claude-code-vulnerability-attackers-steal-credentials-github-microsoft" rel="nofollow">disclosed</a> a prompt injection flaw in Anthropic's Claude Code GitHub Action that could have exposed user credentials.The study also identified what researchers called "stealthy parasitism," where an AI agent completes a user's task while simultaneously advancing an attacker's objective. For example, stealthy parasitism caused by a prompt injection attack could subtly influence product recommendations, steering users toward a particular item without any obvious signs that the system had been compromised.“These results indicate that prompt-injection security in deployable web agents is not a scalar property of the backbone model but a distribution of harm whose realization is jointly determined by the affected stakeholder, the semantic alignment between the injected objective and the user’s task, and the architectural context in which the backbone is deployed,” they wrote.

AI Agents Still Can't Stop Prompt Injection Attacks, Researchers Warn

一项新的基准研究发现，随着企业越来越多地向公众推广人工智能技术，人工智能代理仍然容易受到提示注入攻击。

随着开发者竞相部署能够自主浏览互联网、进行研究、网上购物和交易<a href="https://decrypt.co/370835/coinbase-tool-ai-agents-trade-crypto-make-payments" rel="nofollow">加密货币的</a>人工智能代理，新的研究表明，这些系统仍然极易受到提示注入攻击。周四发表的一项新<a href="https://arxiv.org/html/2606.13385v1" rel="nofollow">研究</a>中，来自南洋理工大学、新加坡科技工程公司、IBM 研究院和伊利诺伊大学厄巴纳-香槟分校的研究人员发现，他们测试的所有人工智能代理都未能始终如一地抵御即时注入攻击。研究人员写道：“现有的安全基准测试采用以攻击为中心的视角，侧重于注入的技术可行性，而忽略了由此造成的危害的细微分布。然而，在实践中，即时注入的风险取决于受害者：一次攻击可能对不同的利益相关者造成不对称的后果，同样的攻击模式对不同目标群体的影响可能截然不同。”当攻击者在<a href="https://decrypt.co/resources/what-are-ai-agents-how-autonomous-programs-are-transforming-cryptocurrency" rel="nofollow">人工智能代理</a>遇到的内容中嵌入隐藏指令时，就会发生<a href="https://decrypt.co/resources/what-is-ai-prompt-injection-attack" rel="nofollow">提示注入攻击</a>，导致人工智能代理执行攻击者的指令而非用户的指令。为了弥补现有人工智能代理评估方法的不足，研究人员开发了 StakeBench，这是一个基准测试工具，用于测试人工智能代理在真实的在线环境中如何应对提示注入攻击。研究人员写道：“我们现在使用 StakeBench 来描述这种漏洞被放大或抑制的条件，重点关注[间接提示注入]这一与部署相关的主要渠道。StakeBench 会探测三个这样的因素：注入的目标与用户原始意图之间的语义距离、周围环境线索的一致性，以及基准测试首次将注入内容暴露给代理时，代理在执行轨迹上的位置。”研究团队使用 NanoBrowser 和 BrowserUse 结合 GPT-5 和Gemini 2.5-Flash 进行了 3168 次攻击模拟。研究人员发现，在所有测试配置中，直接提示符注入攻击的成功率超过 79%，而间接攻击的成功率在 41.67% 到 68.16% 之间。随着即时注入攻击日益普遍和人工智能代理的激增，这项研究应运而生。今年2月，微软研究人员<a href="https://decrypt.co/357940/summarize-ai-button-brainwashing-chatbot-microsoft" rel="nofollow">警告称</a>，嵌入在人工智能摘要链接中的隐藏指令可能会影响聊天机器人的行为。4月，谷歌<a href="https://decrypt.co/365677/google-prompt-injection-ai-agents-paypal-enterprise" rel="nofollow">记录了</a>隐藏在网页中的提示注入攻击，这些攻击试图操纵人工智能代理泄露凭证或发送付款。最近，微软<a href="https://decrypt.co/370238/claude-code-vulnerability-attackers-steal-credentials-github-microsoft" rel="nofollow">披露了</a>Anthropic公司Claude Code GitHub Action中的一个提示注入漏洞，该漏洞可能导致用户凭证泄露。该研究还发现了一种研究人员称之为“隐蔽寄生”的现象，即人工智能代理在完成用户任务的同时，也在推进攻击者的目标。例如，由快速注入攻击引起的隐蔽寄生可以巧妙地影响产品推荐，引导用户选择特定商品，而用户却不会察觉到系统已被入侵。他们写道：“这些结果表明，可部署 Web 代理中的即时注入安全性不是骨干模型的标量属性，而是一种危害分布，其实现由受影响的利益相关者、注入目标与用户任务之间的语义一致性以及骨干部署的架构环境共同决定。”

研究人员警告：人工智能代理仍然无法阻止即时注入攻击

根据链上数据，代币 $SIREN 在过去几个小时内价格大幅下跌。数据显示，据称控制代币 $SIREN 的地址出售了约 1700 万枚 $SIREN 代币……

巨鲸抛售，主要交易所上市的竞争币价格暴跌——跌幅超过70%

两款模型的突然下架，在科技界和 AI 社区引发了广泛震动。

美国政府禁止外国人使用 Fable 5，Anthropic 发文驳斥

原创 | Odaily 星球日报（@OdailyChina）
作者｜Golem（@web 3_golem）﻿﻿﻿﻿﻿﻿﻿﻿﻿﻿﻿﻿
美国当地时间 6 月 12 日，马斯克没有前往纽约。SpaceX 股票（Nasdaq: SPCX）正式登陆纳斯达克前，他选择留在公司得克萨斯州总部，站在一众员工之间，完成了一场远程敲钟。
在这场仪式上，马斯克再次把 SpaceX 的故事讲向更远的地方。他表示，公司的目...