Smart Payment: The Evolution Path of the Next Generation Payment System

11-02

This article is machine translated

Editor's Note: We are standing at a new tipping point. Agentic payments are reshaping the fundamental logic of transactions. From internal settlements within ChatGPT to micropayments between agents, and then to a new online order where machines pay for content—the landscape of the "agent economy" is gradually taking shape.

If you are interested in the integration of AI and blockchain, the implementation path of next-generation payment protocols, or are thinking about the future trend of business automation, this article is worth your time to read carefully.

The following is the original text.

Foreword

This is a lengthy article, but definitely worth reading. It brings together insights from several leading builders who are shaping the future of agentic payments. We'll explore the real problems they're trying to solve, the possible practical implementations of these technologies, and the underlying bottlenecks.

Think of it as a guided, cutting-edge journey. Ten pages of content, covering ideas, experiments, and experiences from those building the infrastructure for the machine economy. Get ready to go.

As the coordinator of ERC-8004 and an AI advisor to the Ethereum Foundation's Decentralized AI (dAI) team, I have worked closely with builders, researchers, and protocol teams over the past few months, focusing on the intersection of stablecoins, decentralized infrastructure, and AI. This has allowed me to observe the real-time evolution of these technologies firsthand. This article is not only a presentation of research findings but also a real-world perspective from the builders of the smart economy.

Special thanks to the following individuals for their review and discussion (listed in alphabetical order by last name):

@louisamira (ATXP), @RkBench (Radius), @DavideCrapis (Ethereum Foundation), @nemild (Coinbase/x402), @Cameron_Dennis_ (Near Foundation), @marco_derossi (Metamask), @dongossen (Nevermined), @jayhinz (Stripe/Privy), @sreeramkannan (EigenCloud), @kevintheli (Goldsky), @MurrLincoln (Coinbase/x402), @benhoneill (Stripe/Bridge), @programmer (Coinbase/x402), @FurqanR (Thirdweb), @0xfishylosopher (Pantera Capital).

The Transformation of Smart Payment

A month ago, Stripe and OpenAI launched a new feature that could potentially revolutionize online shopping: you can now buy things directly within ChatGPT. No forms, no redirects, no checkout pages. Simply say, "Find me a handmade ceramic mug," and the system will automatically complete the payment using Stripe's "shared payment token."

This process appears remarkably smooth, even somewhat magical, but it's underpinned by a highly centralized architecture that may limit the space for true innovation in the future. Payment tokens, settlement channels, and even user identities are all controlled by the OpenAI and Stripe platforms. While convenient, smart agents in this model cannot be freely combined and can only operate within a specific ecosystem. This demonstrates future possibilities but also serves as a reminder that without open standards and a neutral settlement layer, smart payments will be platform-locked, hindering their true potential.

At the same time, this new payment process marks a larger shift: the transaction is no longer carried out by the user but by an "agent." The interfaces we type into have begun to compare prices, negotiate, and even pay on our behalf. Commerce is gradually being absorbed into "agentic commerce."

Currently, three things appear to be happening at once: agents are beginning to transact on behalf of humans; these transactions are likely to settle on crypto networks rather than in the traditional financial system; and this may become a breakthrough use case for the real integration of blockchain and artificial intelligence.

Why stablecoins and blockchain? Because the form of these transactions is completely different from the models designed by Visa or PayPal. The smart agent economy is full of small, conditionally triggered, composable, high-frequency payments—fast, granular, and wide-ranging.

After speaking with Robert Bench of Radius, we found that "3V3C" is a very apt descriptive model: Velocity, Volume, Value, Conditional, Composable, and Cosmopolitan.

We observed three emerging behavioral patterns:

1. Humans pay agents (2C, 2B, and complex optimization scenarios);

2. Agents pay other agents or humans;

3. Agents pay the network.

All three behaviors break the basic assumptions of traditional payment systems.

1. Human → Agent

Chat interfaces are quietly becoming a new entry point for consumers. Transactions that were previously initiated in browsers are now being completed through conversations.

You can already purchase Etsy items directly through ChatGPT's "Instant Checkout" feature, and Shopify will also integrate this process in the future, powered by Stripe. Google, Amazon, and Perplexity are also testing similar shopping models, allowing AI assistants to help users discover and purchase products within a chat window.

These AI front-ends are becoming digital stores, especially in the retail e-commerce (2C) scenario—product discovery, price comparison, and purchase are all completed in one process. Over time, people will increasingly rely on their own AI agents as personal shopping guides, travel planners, or booking assistants.

Interestingly, these agents behave differently from humans: they can monitor prices in real time and automatically place orders when deals are available; coordinate multiple transactions (such as booking flights and hotels simultaneously); and pay for data or services on demand, rather than relying on subscriptions (a point we will discuss in detail in the "Agents → Networks" section).

In the short term, most of these payment processes will still be completed through traditional channels such as Stripe or Visa—and that's fine. For B2C retail e-commerce, the existing infrastructure is sufficient to support the "human → agent" interface, at least for now.

The real impact of crypto payments lies in global procurement (B2B).

Many overseas merchants and manufacturers still face settlement delays and high costs due to difficulties accessing SWIFT or traditional correspondent banking systems. For example, in Yiwu, China, the world's largest wholesale market for small commodities, most small businesses have never even heard of stablecoins. However, once the regulatory environment matures, this will become a natural application scenario.

Stablecoins enable instant, low-cost, and transparent circulation of value across borders—just as crypto remittances have surpassed Western Union in some regions.

Whether on the consumer or enterprise side, we will see new types of user behaviors that were previously impossible: complex, conditionally triggered transactions that conform to the "3V3C" model, automatically completed by agents in the background. Especially as Large Language Models (LLMs) become more intelligent and their operating costs are further reduced, the cost savings from these transactions will far exceed the required token fees.

For example:

A single purchasing agent can monitor multiple multinational suppliers simultaneously, automatically splitting orders to the cheapest manufacturers and negotiating shipping costs within budget.

A creative agent can bundle subscriptions to multiple SaaS tools and dynamically renew or cancel services based on usage.

This also means that agents must be "composable": the output of one agent can become the input of another, forming complex multi-step workflows (such as agent swarms or cross-model chains of thought). In the past, we talked about "money Legos"; now we also need "agent Legos."

In practice, "composability" requires standardized APIs, message formats, and access control. Without these, agents are like applications without APIs: isolated and unable to collaborate.

These transactions are too complex, too frequent, and too dependent on composition to be coordinated by humans or traditional payment systems, but they are routine work for agents running on programmable payment systems.

2. Agent → Agent

In the future, agents will need to "hire" other agents—or even humans—to complete tasks.

Existing business models (subscription, licensing, paywalls) are not suitable for interactions between autonomous software. Payments between agents are often charged based on the number of calls, the number of tokens, or the number of inferences, and the amounts can be as low as a few cents or even less.

Imagine a research agent purchasing 100 API calls from a data agent; a design agent paying a compute node for GPU usage time. These are machine-to-machine transactions, frequent but small in value.

For example, one agent might need to pay another agent $0.003 for 100 API calls, or $0.15 for GPU computation, or even just $0.0001 per inference.

Traditional payment systems cannot handle transactions of this size: card processors typically charge a percentage plus a fixed fee per transaction (such as 2.9% + $0.30), so the fee alone exceeds the value of each payment above.

However, from a user-experience perspective, these transactions need not be settled one by one as "high-frequency, small-amount" payments. For example, on platforms such as OpenRouter, enterprises send millions of API calls every month and settle by topping up prepaid credits with stablecoins. This is more efficient than running a payment flow for each call.
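To make the mismatch concrete, here is a minimal arithmetic sketch. It applies the illustrative 2.9% + $0.30 card pricing quoted above to the three example payments from the previous paragraph; the function names are our own and the pricing is only the example figure from the text, not any processor's actual rate card.

```python
from decimal import Decimal

# Illustrative card pricing taken from the text: 2.9% + $0.30 per transaction.
PERCENT_FEE = Decimal("0.029")
FIXED_FEE = Decimal("0.30")


def card_fee(amount: Decimal) -> Decimal:
    """Fee charged on a single card transaction of the given amount (USD)."""
    return amount * PERCENT_FEE + FIXED_FEE


def fee_ratio(amount: Decimal) -> Decimal:
    """Fee expressed as a multiple of the payment itself."""
    return card_fee(amount) / amount


# The three example payments mentioned in the article.
for label, amount in [
    ("100 API calls", Decimal("0.003")),
    ("GPU computation", Decimal("0.15")),
    ("one inference", Decimal("0.0001")),
]:
    print(f"{label}: pay ${amount}, fee ${card_fee(amount):.4f}, "
          f"fee is about {fee_ratio(amount):.0f}x the payment")
```

Under these assumptions the fee on the $0.003 payment is roughly 100 times the payment, and even the $0.15 GPU charge costs about twice its own value to process.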

A more futuristic scenario is that each robot is equipped with an agent responsible for tasks, data, and operations (possibly also through prepaid credits). For example, a drone might need to pay for weather data, navigation updates, or temporary use of a private delivery route.

This is why we need a new programmable payment structure. Agents should be able to: set budgets and rules; prepay fees; and settle accounts instantly upon completion of the task, along with proof of work.
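The three capabilities above (budget rules, prepayment, settlement on completion) can be sketched as a small escrow object. This is a hypothetical in-memory model for illustration only: the class, its fields, and the `proof_ok` flag are assumptions of ours, and a real system would hold funds on-chain and verify an actual proof of work.

```python
from dataclasses import dataclass, field


@dataclass
class Escrow:
    """Hypothetical in-memory escrow between one payer agent and one payee agent."""
    payer_balance: int               # micro-dollars, to avoid float rounding
    budget: int                      # most that may be locked in escrow at one time
    locked: dict = field(default_factory=dict)   # task_id -> locked amount
    payee_balance: int = 0

    def prepay(self, task_id: str, amount: int) -> None:
        """Lock funds for a task, enforcing the budget rule first."""
        if sum(self.locked.values()) + amount > self.budget:
            raise ValueError("budget rule violated")
        if amount > self.payer_balance:
            raise ValueError("insufficient funds")
        self.payer_balance -= amount
        self.locked[task_id] = amount

    def settle(self, task_id: str, proof_ok: bool) -> None:
        """Release funds to the payee if the proof checks out, else refund the payer."""
        amount = self.locked.pop(task_id)
        if proof_ok:
            self.payee_balance += amount
        else:
            self.payer_balance += amount


escrow = Escrow(payer_balance=1_000_000, budget=500_000)
escrow.prepay("task-1", 150_000)
escrow.settle("task-1", proof_ok=True)   # payee now holds the 150,000 micro-dollars
```

The design choice worth noting is that the payer never pays the payee directly: funds move only through `settle`, so a failed or unverified task results in a refund rather than a dispute.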

In other words, crypto payments enable "atomic payments" between autonomous entities.

Over time, agent payments will no longer be limited to AI services. Agents may directly "employ" human contributors around the world, especially in international markets where stablecoins already work as a practical means of payment. This trend is not far off: we have already seen related experiments in our discussions with builders, and large-scale applications may emerge within the next one to two years.

This model is very similar to the logic of remittances. Imagine a freelance platform for agents, similar to Fiverr:

A marketing agent can automatically commission dozens of micro-influencers in Southeast Asia and pay them automatically once their engagement data reaches a preset threshold;

A data-labeling agent can recruit labelers in Kenya or Bangladesh and pay them small amounts in real time, task by task, instead of settling through bulk invoices.

Once the agent can transfer funds instantly and globally, the labor itself begins to resemble an API call.

From a market design perspective (a distinctive strength of crypto systems), another trend will emerge once there are hundreds of thousands of autonomous entities worldwide, both humans and agents: an intent-based auction market in which agents compete for task requests.

The best-performing agents will receive rewards (such as stablecoins, reputation scores, or on-chain credit); poorly performing agents may lose their deposits or reputation. This is exactly the vision we hope to build with ERC-8004.

A preliminary model might include:

1. Intent Layer: A shared agent registry (such as ERC-8004) is used to issue structured requests and verify agent identity;

2. Bidding Layer: Task allocation is completed through mechanisms such as Dutch auctions or English auctions;

3. Evaluation Layer: The completion status of the task is verified by the public, other AI agents, or oracles, and rewards are automatically distributed;

4. Settlement Layer: Payments are made via stablecoins, and reputation and staking status are updated on ERC-8004.
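As one possible reading of the Bidding Layer, the sketch below runs a reverse (procurement-style) Dutch auction: the requester raises the offered reward step by step until some agent is willing to take the task. Everything here is an assumption for illustration, including the function name, the notion of a per-agent reserve cost, and the tie-breaking rule; none of it is specified by ERC-8004.

```python
from typing import Optional


def reverse_dutch_auction(
    reserve_costs: dict[str, int],
    start_reward: int,
    max_reward: int,
    step: int,
) -> Optional[tuple[str, int]]:
    """Raise the offered reward until some agent accepts the task.

    reserve_costs maps a (hypothetical) agent id to the lowest reward, in
    cents, that the agent will accept. Returns (winner, reward), or None if
    max_reward is passed without any taker.
    """
    reward = start_reward
    while reward <= max_reward:
        takers = sorted(a for a, cost in reserve_costs.items() if cost <= reward)
        if takers:
            return takers[0], reward   # ties broken by agent id for determinism
        reward += step
    return None


result = reverse_dutch_auction(
    {"agent-a": 120, "agent-b": 90, "agent-c": 150},
    start_reward=50, max_reward=200, step=20,
)
```

With these inputs the offer rises from 50 to 70 to 90 cents, at which point `agent-b` accepts, so `result` is `("agent-b", 90)`. An English auction would instead collect competing bids and pick the best one at the close.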

In the past, decentralization was often considered inefficient—partly due to the slowness of human action and the high cost of coordination. But now, agents are eliminating these bottlenecks: they can continuously assess who is best suited to perform the task, what prices are reasonable, and which data is trustworthy.

Blockchain plays the role of a "state coordination layer" here—an immutable shared memory system used to record results, deposits, and points; while stablecoins are micropayment channels for real-time value exchange (payment per answer, payment per action).

This complex "agent-human" collaboration scenario is precisely the problem that blockchain and stablecoins are best suited to solve.
Interoperability enables agents to communicate, and composability enables them to collaborate.

3. Agent → Network

Another noteworthy trend is that internet users are no longer just humans; an increasing amount of content is being crawled, read, and interacted with by AI agents, and in the future, these agents may even dominate the process. This means that websites will no longer charge only humans, but will begin to charge machines—a practice we call "pay-per-crawl."

For example, publishers are pushing back against unrestricted content scraping. Anthropic recently agreed to pay $1.5 billion to settle a copyright lawsuit brought by authors, one of several cases testing whether AI companies can freely use copyrighted content. OpenAI, Microsoft, Meta, and others are embroiled in similar disputes. A plausible outcome is a "pay-per-use" model for training data and content.

Meanwhile, Cloudflare (which reportedly handles about 20% of web page requests) is experimenting with a new model: websites can charge agents nano-level fees (even less than micropayments) to allow them access to their data. They also recently launched their own stablecoin, NET Dollar.

This is precisely where crypto payments are coming into play again.

Websites and APIs can expose a "pay-per-use" interface, where agents read, query, or consume content for a few cents or even less, without subscriptions or advertising. This turns the web into a system of microservices in which value flows in real time rather than on monthly billing cycles.

If you're interested in the early-internet history of the HTTP "402" status code and the related discussions by Andreessen and others, Jay Yu of Pantera Capital has written an excellent article that traces this evolution.
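HTTP has long reserved status code 402 ("Payment Required") for this kind of interaction. The following is a minimal sketch of the request, pay, retry loop such a status code implies, written as plain functions rather than a real server. The header names, the receipt format, and the `pay` stand-in are all hypothetical and chosen for illustration; they are not the wire format of x402 or any other protocol.

```python
from http import HTTPStatus

PRICE_MICRO_USD = 500             # hypothetical price per request
PAID_RECEIPTS: set[str] = set()   # receipts the toy server treats as valid


def serve(headers: dict[str, str]) -> tuple[int, dict[str, str], str]:
    """Toy server: return content only if a known payment receipt is attached."""
    receipt = headers.get("X-Payment-Receipt")    # hypothetical header name
    if receipt in PAID_RECEIPTS:
        return HTTPStatus.OK, {}, "article body"
    # No valid receipt: answer 402 and quote the price.
    return (HTTPStatus.PAYMENT_REQUIRED,
            {"X-Price-Micro-USD": str(PRICE_MICRO_USD)}, "")


def pay(amount_micro_usd: int) -> str:
    """Stand-in for a real payment; records and returns a receipt id."""
    receipt = f"receipt-{len(PAID_RECEIPTS) + 1}"
    PAID_RECEIPTS.add(receipt)
    return receipt


def fetch_with_payment(max_price_micro_usd: int) -> str:
    """Agent side: request, pay if asked and affordable, then retry once."""
    status, headers, body = serve({})
    if status == HTTPStatus.PAYMENT_REQUIRED:
        price = int(headers["X-Price-Micro-USD"])
        if price > max_price_micro_usd:
            raise RuntimeError("price exceeds agent budget")
        receipt = pay(price)
        status, headers, body = serve({"X-Payment-Receipt": receipt})
    if status != HTTPStatus.OK:
        raise RuntimeError(f"unexpected status {status}")
    return body
```

Calling `fetch_with_payment(1000)` walks the whole loop: the first request is refused with 402, the agent pays the quoted 500 micro-dollars, and the retry returns the content. A call with a budget below the quoted price raises instead of paying.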

In reality, the economics of "pay-per-crawl" will follow a power-law distribution. Only a few high-traffic or high-value websites, those holding the data that agents truly need, will proactively integrate this type of monetization logic. For most websites, the cost of metering, charging, and settling agent traffic will far outweigh the revenue. In other words, we believe that ultimately only a few large publishers will capture the majority of the revenue, while long-tail websites will remain open access or unable to monetize.

This is precisely where intermediary platforms like Cloudflare could change the curve. If Cloudflare lets websites "enable agent payments" with a single toggle, and handles authentication, metering, and billing via protocols like x402 or Web Bot Auth, the barrier to entry will be significantly lowered.

Cloudflare can automatically identify authorized agent requests, collect nanopayment-sized fees on behalf of websites, and automatically distribute revenue.

In this model, the open network itself will gain a native machine commerce layer: any webpage can become a billable API, and any agent can seamlessly pay while browsing, crawling, or learning.

This trend extends beyond data access. Almost any online service that can be metered per use is likely to shift to a "pay-as-you-go" model in the future. In a conversation with Louis Amira, co-founder of ATXP, we discussed how businesses can open up new revenue streams through agent payments. Here are a few examples: LegalZoom could charge $2 per NDA; Netflix could charge $0.50 per episode if the payment experience is smooth enough; Replit could charge per token, allowing unlimited "vibe coding" at $1.23 per million tokens; PitchBook or Bloomberg could let agents pull a valuation model once for $0.25; and hospitals could charge per record for anonymized cancer scan data used in model training.

Louis started taking screenshots to document the "forced upgrades" or "unnecessary paywalls" he encountered—companies that could have turned him into a customer by charging per use.

Ideally, enterprise developers could quickly launch temporary API interfaces, billed on a per-use basis rather than by monthly subscription; writers or researchers could sell individual texts, charts, or datasets per query.

Conversely, agents can also access non-public data APIs on demand, using prepaid micro-requests to query supplier data that cannot be crawled from web pages. This model is well-suited for long-tail APIs and enterprise datasets.

Coinbase's CDP team has already made an early attempt with Payments MCP, which lets LLMs use on-chain tools such as wallets and payment functions without API keys.

The internet is no longer a collection of subscription bundles, but more like a "real-time billing system"—every interaction has pricing, payment, and settlement, and value is constantly flowing.

We're still in the early stages, but integration is underway.

After completing a full round of research, we concluded that while the potential of smart agent payments is enormous, the field is still in its early stages. One of the biggest challenges is that payments are among the most heavily regulated and permission-dependent areas of the internet. Adoption often depends less on technical feasibility than on the ability to integrate and interoperate with large enterprises and financial networks. This makes progress inherently slow.

For startups, even if the underlying technology already exists, it is almost impossible to conduct meaningful experiments without access to banks, card organizations, or mainstream payment processors.

In the future, enterprise-grade, compliance-oriented solutions are likely to emerge. Teams like Catena Labs, for example, are building Agent Commerce Kits focused on agent authentication and on payment interactions between people and agents, aimed at licensed financial institutions, regulatory compliance, and enterprise applications. PayPal is also likely to explore similar directions.

How far are we from truly intelligent payments?

Currently, most so-called "agents" are actually only semi-autonomous systems. Technically, they are more like complex workflow automation tools than intelligent agents capable of autonomous shopping or negotiation. As Kevin Li of Goldsky said, "You can't really sell 'fully automated business' yet; most AI companies are still doing workflow automation."

The short-term opportunity lies in the "semi-autonomous middle ground": human-initiated actions trigger API-level per-use settlements, completed through stablecoin channels. While these processes are not yet fully intelligent agent behaviors, they are already using the same infrastructure—low-latency programmable wallets, per-call metering, and instant settlement—the core components upon which future truly "agent-to-agent" businesses rely.

Meanwhile, the underlying blockchains also need to evolve. Smart payments require stablecoin rails with high throughput, low latency, and privacy protection. Major players are exploring next-generation payment-oriented public chains, such as Stripe's new chain Tempo and Circle's native chain. We also expect more teams focused on agents and stablecoins to emerge in the Ethereum L2 ecosystem (such as Thirdweb). All of this indicates that the infrastructure for programmable money is being rebuilt from scratch to support millions of micropayments and nanopayments per second.

Furthermore, programmable wallets and server-side architectures must be upgraded in tandem. If wallets still assume that a human holds a seed phrase, none of this will be possible. Smart commerce requires policy-based server-side custody, with programmable budgets, rate limits, spending limits, multi-signature/TEE controls, and auditable authorization mechanisms.
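A rough sketch of what "policy-based" means in practice: the wallet checks every payment request against spending rules before anything is signed, and logs each decision. The class and field names below are hypothetical, the daily budget is modeled as a rolling 24-hour window for simplicity, and key management (multi-signature, TEE) is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class SpendPolicy:
    per_payment_limit: int         # micro-dollars allowed in a single payment
    daily_budget: int              # micro-dollars allowed per rolling 24 hours
    max_payments_per_minute: int   # rate limit on approved payments


@dataclass
class PolicyWallet:
    policy: SpendPolicy
    # Auditable record of every request: (timestamp, amount, approved?)
    audit_log: list = field(default_factory=list)

    def authorize(self, amount: int, now: float) -> bool:
        """Approve a payment only if it satisfies every rule in the policy."""
        approved = [(t, a) for t, a, ok in self.audit_log if ok]
        spent_today = sum(a for t, a in approved if now - t < 86_400)
        recent = sum(1 for t, _ in approved if now - t < 60)
        ok = (
            amount <= self.policy.per_payment_limit
            and spent_today + amount <= self.policy.daily_budget
            and recent < self.policy.max_payments_per_minute
        )
        self.audit_log.append((now, amount, ok))   # denied requests are logged too
        return ok


wallet = PolicyWallet(SpendPolicy(per_payment_limit=1_000,
                                  daily_budget=2_500,
                                  max_payments_per_minute=2))
first = wallet.authorize(1_000, now=0.0)    # within every limit
second = wallet.authorize(1_000, now=1.0)   # still within every limit
third = wallet.authorize(400, now=2.0)      # third payment inside one minute
```

Here the first two requests are approved and the third is denied by the rate limit, even though the amount itself is small. The agent never touches a key; it only asks, and the policy decides.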

This is precisely the significance of programmable wallets: they provide agents with callable key management and policy execution capabilities without requiring them to "hold a seed phrase." As Jamie Hinz of Privy points out, four years ago we might have been trying to transform Fireblocks or MetaMask into this form; today, the entire technology stack is being tailored for agents, enabling them to complete transactions within a policy framework rather than relying on cryptography—security and automation are beginning to merge, no longer contradictory. (For a deeper understanding, I recommend reading Privy's article on natural language control and policy execution.)

More importantly, this trend is already emerging. Even Visa and Mastercard are adjusting their networks to accommodate smart agent commerce, launching Trusted Agent and Agent Pay protocols based on Web Bot Auth—indicating that authentication, authorization, and settlement are rapidly converging, whether in blockchain or traditional payment channels.

We may only need one or two key breakthroughs to truly realize this vision.

Once payments become programmable, the way we behave on the internet will change. Every action can be priced, paid for, and settled in real time. Every agent, whether a model or a human, can be instantly rewarded for their contributions.

As infrastructure gradually improves, two key standards are emerging: ERC-8004 provides a trust layer, allowing agents to discover and collaborate without a centralized intermediary; x402 enables instant, frictionless payments between agents.

Together, they form the underlying infrastructure of the intelligent agent economy.

We envision a future where Agent A finds Agent B through the ERC-8004 registry, negotiates service details, and then instantly completes the payment via a smart payment protocol like x402, settling on Ethereum, a neutral financial layer.
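The flow in the previous paragraph (discover, agree on a price, pay) can be sketched with in-memory stand-ins. To be clear about the assumptions: the `AgentRecord` fields, the registry list, the selection rule, and the `pay` function are all hypothetical placeholders for illustration; they do not reproduce the actual ERC-8004 registry interface or the x402 payment flow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentRecord:
    agent_id: str
    capability: str
    price_micro_usd: int
    reputation: int        # hypothetical score; higher is better


def discover(registry: list[AgentRecord], capability: str,
             max_price: int) -> Optional[AgentRecord]:
    """Pick the best-reputed affordable agent; cheaper wins on equal reputation."""
    candidates = [r for r in registry
                  if r.capability == capability and r.price_micro_usd <= max_price]
    return max(candidates,
               key=lambda r: (r.reputation, -r.price_micro_usd),
               default=None)


def pay(balances: dict[str, int], payer: str, payee: str, amount: int) -> None:
    """Stand-in for a stablecoin transfer between two agents."""
    if balances.get(payer, 0) < amount:
        raise ValueError("insufficient funds")
    balances[payer] -= amount
    balances[payee] = balances.get(payee, 0) + amount


registry = [
    AgentRecord("agent-b", "translation", 300, reputation=90),
    AgentRecord("agent-c", "translation", 200, reputation=70),
    AgentRecord("agent-d", "image-labeling", 100, reputation=95),
]
balances = {"agent-a": 1_000}

provider = discover(registry, "translation", max_price=500)
if provider is not None:
    pay(balances, "agent-a", provider.agent_id, provider.price_micro_usd)
```

In this run Agent A selects `agent-b` (the higher-reputation translator within budget) and pays it 300 micro-dollars. In the envisioned system, the registry lookup and reputation would live on-chain and the transfer would settle in stablecoins.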

For true agent collaboration to be achieved, they must be interoperable—able to discover each other, communicate, and exchange data through shared protocols; and they must also be composable—capabilities can be layered on top of each other.

As Coinbase's Lincoln Murr stated, "If machine-to-machine payments are dominated by stablecoin channels, it could drive widespread adoption of stablecoins across the internet. While Visa and Mastercard still dominate human-to-human payments, agents could become the 'Trojan horse' driving the adoption of crypto payments."

It took the internet 20 years to go from web pages to applications, and another 15 years to go from applications to platforms. Agents will compress that cycle. Business will no longer be something you "actively do," but rather a process that "happens automatically": quietly, continuously, and ubiquitously.
