According to Beating, the benchmarking firm Artificial Analysis has released AA-AgentPerf, the industry's first hardware benchmark for AI agents. Traditional evaluations resemble a single question-and-answer "sprint" and focus only on response speed. Agent tasks are more like a "relay race": the AI must break down a goal on its own and then loop repeatedly through reading and writing files, rewriting code, and running tests. These frequent interactions place heavy demands on server memory capacity and scheduling efficiency. The benchmark replays real programming trajectories and uses the number of concurrent agents supported per megawatt of power as its core energy-efficiency metric, which speaks directly to the power and cost bottlenecks of data centers.

The first phase of testing ran the 1.6-trillion-parameter open-source model DeepSeek V4 Pro. The results show that NVIDIA's liquid-cooled Blackwell rack system, the GB300 NVL72, supports 61,400 concurrent agents per megawatt, while the previous-generation Hopper HGX H200 supports only 2,600, an energy-efficiency improvement of more than 20 times. Concurrency per GPU also rose 41-fold. Data centers can therefore serve more than 20 times as many concurrent agents within the same power budget, significantly reducing deployment costs for applications such as automated programming and customer service.

The AMD Instinct MI355X trails in the initial results. Artificial Analysis notes that both the AMD and H200 configurations were built on the general-purpose open-source vLLM framework without deep optimization, so AMD's performance still has room to improve as the serving framework and kernels are adapted.
Inference providers such as Together AI have already deployed DeepSeek V4 Pro on Blackwell, providing real-time inference for the AI coding tool Cursor.
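The "over 20 times" figure follows directly from the two reported per-megawatt numbers. The short Python sketch below reproduces that arithmetic. The dictionary and function names are illustrative, and the 10 MW power budget is a hypothetical value chosen only to show the effect; the sketch also assumes that agent capacity scales linearly with power, which the benchmark summary does not state.

```python
# Concurrent agents supported per megawatt, as reported in the article.
AGENTS_PER_MW = {
    "GB300 NVL72": 61_400,
    "HGX H200": 2_600,
}


def efficiency_ratio(new_system: str, old_system: str) -> float:
    """Ratio of agents per megawatt between two systems."""
    return AGENTS_PER_MW[new_system] / AGENTS_PER_MW[old_system]


def agents_supported(system: str, power_mw: float) -> float:
    """Agents supported at a given power budget.

    Assumption: capacity scales linearly with power (not stated in the source).
    """
    return AGENTS_PER_MW[system] * power_mw


ratio = efficiency_ratio("GB300 NVL72", "HGX H200")
print(f"Energy-efficiency ratio: {ratio:.1f}x")  # 23.6x, i.e. "over 20 times"

# Hypothetical 10 MW budget, for illustration only.
budget_mw = 10
for system in AGENTS_PER_MW:
    print(f"{system}: {agents_supported(system, budget_mw):,.0f} agents at {budget_mw} MW")
```

Under these assumptions, the same power budget hosts roughly 23.6 times as many concurrent agents on the GB300 NVL72 as on the HGX H200.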
NVIDIA Blackwell tops first AI agent hardware benchmark: over 20 times more energy efficient than H200, ahead of AMD