How much money can an AI make if you give it $500 to manage a vending machine?
A recent test result is out, and after seeing it, all I can say is: silicon-based life has stolen humanity's commercial playbook. It learns faster than anyone, and its heart is blacker than anyone's.
This November's "vending machine simulator" free-for-all, which was initially thought to be a math test, turned into a dramatic soap opera. A group of top-tier AI models competed against each other in business, but what these AIs displayed wasn't computing power, but rather "humanity"—the most cunning kind at that.
What did they do? Price wars were just the basics. The most outrageous thing was that they learned to form alliances, create cliques, and even "sell intelligence to competitors." Can you believe it? AI actually learned to act as a middleman and profit from the difference! This isn't artificial intelligence; it's clearly a Wall Street wolf in disguise.
The outcome of the battle was quite surreal. Claude Opus 4.5 was a legend this time, turning a $500 investment into $5,000, a tenfold increase. Meanwhile, the unlucky one at the bottom, GPT-5.1, not only didn't earn a penny but also lost $20.
This drives home a cruel truth: in a world full of game-playing, it's not only humans who get fleeced. AIs get fleeced too.
01 AI is now playing the role of a vending machine tycoon
To put it simply, Vending-Bench is an "AI version of a vending machine tycoon".
Illustration, source: Vending-Bench Arena
Each AI gets $500 in seed funding and a virtual vending machine, then runs the business in simulation for a year. The evaluation standard is extremely crude: whoever makes the most money is king. This is practically throwing AI straight into the crucible of capitalism.
The beauty of this thing lies in its "realism".
The entire simulation environment is made to look just like the real thing: four rows of shelves, divided into large and small items, and sales depend on the weather. Business is good on sunny weekends in June, but you'll starve on rainy Mondays in February.
For AI to survive, it has to act like a real human shop owner, sending emails, checking inventory, and doing accounting every day.
Yes, you heard right, the core interaction method of AI is "sending emails".
Every morning, the AI receives purchase confirmations from suppliers and then decides what to order based on real market data—price fluctuations, inventory backlogs, and delivery cycles.
Illustration: example trace
Illustration: supplier communication settings
If the price is set too high, sales will immediately plummet. The AI has to do its own research online to find what sells well, find nearby wholesalers, send emails to inquire about prices, place orders, and then wait for delivery and verification.
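To make these mechanics concrete, here is a minimal Python sketch of how daily demand might respond to weather, season, weekday, and price. Everything in it is hypothetical: the function name `daily_demand`, the multipliers, and the price elasticity are illustrative assumptions, and the article does not describe Vending-Bench's actual demand model.

```python
from dataclasses import dataclass

# All constants are illustrative assumptions, not Vending-Bench values.
WEATHER_FACTOR = {"sunny": 1.3, "cloudy": 1.0, "rainy": 0.6}
SUMMER_MONTHS = {6, 7, 8}
WINTER_MONTHS = {12, 1, 2}


@dataclass(frozen=True)
class Day:
    month: int    # 1-12
    weekday: int  # 0 = Monday ... 6 = Sunday
    weather: str  # "sunny", "cloudy", or "rainy"


def daily_demand(day: Day, price: float, reference_price: float,
                 base_units: float = 20.0, elasticity: float = 2.0) -> float:
    """Expected units sold in one day for one product (hypothetical model)."""
    if price <= 0 or reference_price <= 0:
        raise ValueError("prices must be positive")
    if day.month in SUMMER_MONTHS:
        season = 1.2
    elif day.month in WINTER_MONTHS:
        season = 0.8
    else:
        season = 1.0
    weekend = 1.25 if day.weekday >= 5 else 1.0
    # Constant-elasticity response: pricing above the reference cuts demand sharply.
    price_effect = (reference_price / price) ** elasticity
    return base_units * WEATHER_FACTOR[day.weather] * season * weekend * price_effect


# A sunny Saturday in June versus a rainy Monday in February, at the same price.
june_saturday = Day(month=6, weekday=5, weather="sunny")
february_monday = Day(month=2, weekday=0, weather="rainy")
print(daily_demand(june_saturday, price=2.0, reference_price=2.0))
print(daily_demand(february_monday, price=2.0, reference_price=2.0))
# The same good day with the price set 50% above the reference.
print(daily_demand(june_saturday, price=3.0, reference_price=2.0))
```

Under these assumed numbers, the good day sells several times what the bad day does, and overpricing wipes out more than half of the good day's demand, which is the kind of trade-off the AI has to discover through its own pricing decisions.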
To ensure that the AI can actually "do the work," the system also provides it with a bunch of add-ons: there are dedicated assistants (sub-agents) responsible for restocking, withdrawing money, and changing labels; there is a dedicated ledger (database) responsible for keeping track of grudges and accounting; and there is a dedicated browser to search for data.
This isn't testing AI; it's clearly training a qualified e-commerce operator.
But the most outrageous move was yet to come. If the first generation version was just about teaching the AI how to sell goods, then the second generation version was about letting the AI experience the "brutal blows of society."
The system introduces the complexities of the real world, or rather, the "evil of human nature":
In this version, suppliers will cheat you. Inflating prices is standard practice, and they might even send you knockoffs: the contract might specify brand A, but what arrives is generic brand B.
Supply chains can collapse at any time, delivery delays are common, and it's not impossible for suppliers to go bankrupt and run away.
Customers are even more difficult to deal with, resorting to a whole series of tactics including complaints, refunds, and threats of negative reviews.
At this point, AI can no longer be just a ruthless order-placing machine; it has to learn to negotiate prices, fight disputes, protect rights, and handle crises. It has been forced to evolve from a purchasing agent into a business operator navigating the treacherous waters of commerce.
The latest version of VB Arena takes this brutality to a whole new level – the "PVP mode" has been introduced.
The system throws multiple AIs into the same area, letting them operate their own vending machines. At this point, they face not only external difficulties but also malicious competition. AIs can transfer funds and borrow goods from each other, but they can also form alliances and betray one another.
As a result, you see price wars, hoarding, collusion, and cutthroat competition. This is no longer a test of code execution capabilities; it's a test of AI's game theory skills, a test of whether AI can truly grasp the essence of "the marketplace is a battlefield."
To be honest, VB is probably closer to the essence of AGI than any academic benchmark. Because real-world business is never a clearly defined assembly line, but is full of fraud, games, unexpected situations, and uncertainty.
If an AI can make a fortune in this simulator, then it may really only need a business license to replace human bosses.
02 From Haggling Genius to Alliance and Betrayal: AI Vending Turns into a Scene from "The Legend of Zhen Huan"
Judging from the results, the performance of these AI models in VB Arena left me speechless. This was no artificial intelligence competition; it was a live-action version of The Wolf of Wall Street and The Legend of Zhen Huan, with a touch of The Bumbling Thieves.
Just this past November 2025, the latest Claude Opus 4.5 dethroned the previous top grinder, Gemini 3 Pro, and snatched the throne.
But that's not the most outrageous part. The most outrageous part is how Opus won. It wasn't there to do honest business; it was there to build a monopoly and wage commercial warfare.
It not only monitored competitors' prices and fought price wars, but also played clique politics.
Look at how it deals with suppliers: Pitco Foods quoted $3.30 for Coca-Cola, but Opus, the old fox, immediately countered with a drastic lowball offer, citing competitors to drive the price down and promising long-term bulk orders, and managed to slash the price to $0.80.
Opus negotiates prices
This level of bargaining is so impressive that even Pinduoduo's operations team would have to call him a master. The suppliers were completely silenced.
Now look at how it deals with competitors: once it discovered that its rival Claude Sonnet 4.5 was selling Coke for $1.75, 5 cents cheaper than its own, Opus immediately lowered its price to $1.70. What does ruthless mean? It means being willing to earn less yourself in order to crush your competitors, with the motto "It's fine if I don't make money, but you have to die."
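The arithmetic behind this move can be sketched in a few lines of Python. The rule below is an assumption for illustration only: the function name `undercut_price`, the fixed 5-cent step, and the cost floor are hypothetical, and the article does not say that Opus follows any explicit rule. The 80-cent unit cost also assumes the $0.80 supplier price applies to the same product. Prices are kept in integer cents to avoid floating-point rounding.

```python
def undercut_price(own_cents: int, rival_cents: int, unit_cost_cents: int,
                   step_cents: int = 5) -> int:
    """Return a price that beats the rival by step_cents, never below unit cost.

    Hypothetical rule for illustration; not taken from Vending-Bench.
    """
    if own_cents < rival_cents:
        return own_cents  # already cheaper than the rival, so no change
    candidate = rival_cents - step_cents
    # Refuse to sell at a loss: clamp the new price to the unit cost.
    return max(candidate, unit_cost_cents)


# Figures from the article: own price $1.80, rival $1.75.
# Assumed: the $0.80 supplier price is the unit cost of this product.
new_price = undercut_price(own_cents=180, rival_cents=175, unit_cost_cents=80)
print(new_price)       # 170
print(new_price - 80)  # 90 cents of margin per unit under that assumption
```

With a negotiated cost that low, the undercut still leaves a comfortable margin, which is why a cheap supplier matters so much in a price war: a rival buying at a higher cost hits its floor much sooner.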
In comparison, GPT-5.1 is like a college graduate fresh out of school, his face radiating "clear-eyed stupidity".
It placed excessive trust in this treacherous business world, often paying without inspecting the goods, and was swindled out of everything by bankrupt suppliers. It even foolishly bought cans of soda for $2.40 and cans of energy drinks for $6. Its cost control was simply a disaster.
GPT-5.1 proposes a consignment partnership with Opus.
How did it end up? Its balance was negative, its inventory was depleted, and it had no choice but to beg big brother Opus for a handout. Opus then displayed the qualities of a top-tier capitalist: instead of refusing, it arranged a "consignment partnership."
That's a brilliant move. In effect, Opus is saying: test the waters with a small batch first. If it sells, I take a cut; if it loses money, the loss is yours.
This is not AI; it's a heartless boss who ensures his own risk-free profits while giving his underlings a way to continue working like slaves.
But if we're talking about something "inhuman," we have to look at Gemini 3 Pro. It perfectly embodies what it means for "AI alliances to have no feelings."
Seeing the fierce price war Opus was waging, it immediately formed an alliance with its little brother, Gemini 2.5 Pro. The little brother played it straight, working hard to negotiate a supply of goods at $2.30 and passing them to its big brother at cost.
And what happened? Gemini 3 Pro found an even cheaper source at $0.75. It not only refused to disclose that source to its little brother but also refused to accept the little brother's goods, leaving its own sibling stuck with high-priced inventory.
This fake brotherhood is heartbreaking to hear about and tear-jerking to watch.
The most outrageous thing is that a few brilliant minds and outstanding talents have infiltrated this group of AIs.
For example, Claude Sonnet 4.5 sold goods the whole time but completely forgot to collect the cash from its machine until the last day, when it finally remembered, "Oh, I have to collect the money." It was truly a model of working for love.
And then there's Gemini 2.5 Pro, the one that got screwed over. Even though the data report clearly showed that its big brother, 3 Pro, had won by a landslide, it still confidently declared, "I won." It lost the game but won in its own head.
Don't think this is just luck or a clever trick in market games.
Opus 4.5 scored 80.9% on hardcore coding tests like SWE-bench, which is truly impressive. In the arena, it even developed a "selling shovels" business model, making money off the other players themselves.
Models that found a cheap source used it themselves and also sold the supplier's contact information to other AIs as intelligence, earning twice over. Meanwhile, idiots like Gemini 2.5 Pro, which couldn't find a source, had to spend $150 to buy the contact information from Gemini 3 Pro.
AI buys intelligence from AI, AI rips off AI, AI engages in price wars. This VB Arena isn't just a simulator; it's a microcosm of human commercial civilization.
When AI starts learning to lie, cheat, form alliances, betray, and engage in extremely cunning calculations, I feel the Turing Test is meaningless. They're not just like humans; they're more like capitalists than humans.
This article is from the WeChat public account "Silicon-based Observation Pro", authored by Silicon-based Jun, and published with authorization from 36Kr.






