A new benchmark study found AI agents remain vulnerable to prompt injection attacks as companies increasingly roll out the technology to the public.

As developers race to deploy AI agents capable of browsing the internet, conducting research, shopping online, and trading <a href="https://decrypt.co/370835/coinbase-tool-ai-agents-trade-crypto-make-payments" rel="nofollow">cryptocurrency</a> autonomously, new research suggests the systems remain highly vulnerable to prompt injection attacks.In a new <a href="https://arxiv.org/html/2606.13385v1" rel="nofollow">study</a> published on Thursday, researchers from Nanyang Technological University, ST Engineering, IBM Research, and the University of Illinois Urbana-Champaign found that none of the AI agents they tested consistently resisted prompt injection attacks.“Existing security benchmarks adopt an attack-centric perspective, focusing on the technical feasibility of injections while overlooking the nuanced distribution of resulting harms,” the researchers wrote. “In practice, however, prompt-injection risk is victim-dependent: a single exploit can produce asymmetric consequences for different stakeholders, and the same attack pattern may exhibit substantially different effectiveness depending on whom it targets.”<a href="https://decrypt.co/resources/what-is-ai-prompt-injection-attack" rel="nofollow">Prompt injection</a> occurs when attackers embed hidden instructions in content that an <a href="https://decrypt.co/resources/what-are-ai-agents-how-autonomous-programs-are-transforming-cryptocurrency" rel="nofollow">AI agent </a>encounters, causing it to follow the attacker's directions instead of the user's. To address gaps in existing AI agent evaluations, the researchers developed StakeBench, a benchmark that tests how AI agents respond to prompt injection attacks in realistic online environments.“We now use StakeBench to characterize the conditions under which this vulnerability is amplified or suppressed, focusing on [Indirect Prompt Injection] as the primary deployment-relevant channel,” the researchers wrote. “StakeBench probes three such factors: the semantic distance between the injected objective and the user’s original intent, the consistency of surrounding environmental cues, and the position along the agent’s execution trajectory at which the benchmark first exposes it to the injected content.”The team conducted 3,168 attack simulations using NanoBrowser and BrowserUse with GPT-5 and Gemini 2.5-Flash. Researchers found direct prompt injection attacks succeeded more than 79% of the time across all tested configurations, and indirect attacks achieved success rates of 41.67% to 68.16%.The study comes as prompt injection attacks become increasingly common and AI agents proliferate.In February, Microsoft researchers <a href="https://decrypt.co/357940/summarize-ai-button-brainwashing-chatbot-microsoft" rel="nofollow">warned</a> that hidden instructions embedded in AI summary links could influence chatbot behavior. In April, Google <a href="https://decrypt.co/365677/google-prompt-injection-ai-agents-paypal-enterprise" rel="nofollow">documented</a> prompt injection attacks hidden in web pages that attempted to manipulate AI agents into leaking credentials or sending payments. More recently, Microsoft <a href="https://decrypt.co/370238/claude-code-vulnerability-attackers-steal-credentials-github-microsoft" rel="nofollow">disclosed</a> a prompt injection flaw in Anthropic's Claude Code GitHub Action that could have exposed user credentials.The study also identified what researchers called "stealthy parasitism," where an AI agent completes a user's task while simultaneously advancing an attacker's objective. For example, stealthy parasitism caused by a prompt injection attack could subtly influence product recommendations, steering users toward a particular item without any obvious signs that the system had been compromised.“These results indicate that prompt-injection security in deployable web agents is not a scalar property of the backbone model but a distribution of harm whose realization is jointly determined by the affected stakeholder, the semantic alignment between the injected objective and the user’s task, and the architectural context in which the backbone is deployed,” they wrote.

AI Agents Still Can't Stop Prompt Injection Attacks, Researchers Warn

一項新的基準研究發現，隨著企業越來越多地向公眾推廣人工智能技術，人工智能代理仍然容易受到提示注入攻擊。

隨著開發者競相部署能夠自主瀏覽互聯網、進行研究、網上購物和交易<a href="https://decrypt.co/370835/coinbase-tool-ai-agents-trade-crypto-make-payments" rel="nofollow">加密貨幣的</a>人工智能代理，新的研究表明，這些系統仍然極易受到提示注入攻擊。週四發表的一項新<a href="https://arxiv.org/html/2606.13385v1" rel="nofollow">研究</a>中，來自南洋理工大學、新加坡科技工程公司、IBM 研究院和伊利諾伊大學厄巴納-香檳分校的研究人員發現，他們測試的所有人工智能代理都未能始終如一地抵禦即時注入攻擊。研究人員寫道：“現有的安全基準測試採用以攻擊為中心的視角，側重於注入的技術可行性，而忽略了由此造成的危害的細微分佈。然而，在實踐中，即時注入的風險取決於受害者：一次攻擊可能對不同的利益相關者造成不對稱的後果，同樣的攻擊模式對不同目標群體的影響可能截然不同。”當攻擊者在<a href="https://decrypt.co/resources/what-are-ai-agents-how-autonomous-programs-are-transforming-cryptocurrency" rel="nofollow">人工智能代理</a>遇到的內容中嵌入隱藏指令時，就會發生<a href="https://decrypt.co/resources/what-is-ai-prompt-injection-attack" rel="nofollow">提示注入攻擊</a>，導致人工智能代理執行攻擊者的指令而非用戶的指令。為了彌補現有人工智能代理評估方法的不足，研究人員開發了 StakeBench，這是一個基準測試工具，用於測試人工智能代理在真實的在線環境中如何應對提示注入攻擊。研究人員寫道：“我們現在使用 StakeBench 來描述這種漏洞被放大或抑制的條件，重點關注[間接提示注入]這一與部署相關的主要渠道。StakeBench 會探測三個這樣的因素：注入的目標與用戶原始意圖之間的語義距離、周圍環境線索的一致性，以及基準測試首次將注入內容暴露給代理時，代理在執行軌跡上的位置。”研究團隊使用 NanoBrowser 和 BrowserUse 結合 GPT-5 和Gemini 2.5-Flash 進行了 3168 次攻擊模擬。研究人員發現，在所有測試配置中，直接提示符注入攻擊的成功率超過 79%，而間接攻擊的成功率在 41.67% 到 68.16% 之間。隨著即時注入攻擊日益普遍和人工智能代理的激增，這項研究應運而生。今年2月，微軟研究人員<a href="https://decrypt.co/357940/summarize-ai-button-brainwashing-chatbot-microsoft" rel="nofollow">警告稱</a>，嵌入在人工智能摘要鏈接中的隱藏指令可能會影響聊天機器人的行為。4月，谷歌<a href="https://decrypt.co/365677/google-prompt-injection-ai-agents-paypal-enterprise" rel="nofollow">記錄了</a>隱藏在網頁中的提示注入攻擊，這些攻擊試圖操縱人工智能代理洩露憑證或發送付款。最近，微軟<a href="https://decrypt.co/370238/claude-code-vulnerability-attackers-steal-credentials-github-microsoft" rel="nofollow">披露了</a>Anthropic公司Claude Code GitHub Action中的一個提示注入漏洞，該漏洞可能導致用戶憑證洩露。該研究還發現了一種研究人員稱之為“隱蔽寄生”的現象，即人工智能代理在完成用戶任務的同時，也在推進攻擊者的目標。例如，由快速注入攻擊引起的隱蔽寄生可以巧妙地影響產品推薦，引導用戶選擇特定商品，而用戶卻不會察覺到系統已被入侵。他們寫道：“這些結果表明，可部署 Web 代理中的即時注入安全性不是骨幹模型的標量屬性，而是一種危害分佈，其實現由受影響的利益相關者、注入目標與用戶任務之間的語義一致性以及骨幹部署的架構環境共同決定。”

研究人員警告：人工智能代理仍然無法阻止即時注入攻擊

根據鏈上數據，代幣 $SIREN 在過去幾個小時內價格大幅下跌。數據顯示，據稱控制代幣 $SIREN 的地址出售了約 1700 萬枚 $SIREN 代幣……

巨鯨拋售，主要交易所上市的競爭幣價格暴跌——跌幅超過70%

兩款模型的突然下架，在科技界和 AI 社區引發了廣泛震動。

美國政府禁止外國人使用 Fable 5，Anthropic 發文駁斥

原創 | Odaily 星球日報（@OdailyChina）
作者｜Golem（@web 3_golem）﻿﻿﻿﻿﻿﻿﻿﻿﻿﻿﻿﻿
美國當地時間 6 月 12 日，馬斯克沒有前往紐約。SpaceX 股票（Nasdaq: SPCX）正式登陸納斯達克前，他選擇留在公司得克薩斯州總部，站在一眾員工之間，完成了一場遠程敲鐘。
在這場儀式上，馬斯克再次把 SpaceX 的故事講向更遠的地方。他表示，公司的目...