I. Attention's Appetite for Novelty and Distaste for the Old
Over the past year, application-layer narratives have stalled and failed to keep pace with the explosion of infrastructure, and crypto has gradually become a contest for attention. From Silly Dragon to Goat, from Pump.fun to Clanker, attention's habit of chasing the new and discarding the old has driven this contest into involution. It started with the most clichéd form of monetizing eyeballs, quickly evolved into a platform model in which the demanders and suppliers of attention are the same people, and then silicon-based life forms became the new content providers. Among the myriad vehicles for MEME coins, one has finally emerged on which retail investors and VCs can agree: the AI Agent.
Attention is ultimately a zero-sum game, but speculation can genuinely drive wild growth. In our article on UNI, we reviewed the start of blockchain's golden age: the rapid growth of DeFi was kicked off by the LP mining era that Compound Finance opened, and hopping in and out of mining pools with APYs in the thousands or even tens of thousands was the most primitive form of on-chain gaming at the time, even though the eventual outcome was the collapse of various pools. The frenzied influx of gold miners nevertheless left unprecedented liquidity on-chain, and DeFi ultimately broke free of pure speculation to become a mature sector that meets users' financial needs in payments, trading, arbitrage, staking, and more. AI Agents are now in this same wild stage, and what we are exploring is how crypto can better integrate with AI and ultimately push the application layer to new heights.
II. How Can Agents Be Autonomous?
In the previous article, we briefly introduced the origin of AI MEME: Truth Terminal, and the prospects for AI Agents. This article focuses primarily on the AI Agent itself.
We start with the definition of an AI Agent. "Agent" is an old but loosely defined term in AI. Its main emphasis is autonomy: any AI that can perceive its environment and respond reflexively can be called an agent. In today's usage, an AI Agent is closer to an intelligent agent in the fuller sense, that is, a system built around a large model that mimics human decision-making, which academia regards as the most promising path to AGI (Artificial General Intelligence).
With early versions of GPT, large models clearly felt human-like, yet on many complex questions they could only give plausible-sounding answers. The fundamental reason is that these models were built on probability rather than causality. Beyond that, they lacked human capabilities such as tool use, memory, and planning, and the AI Agent can fill these gaps. To sum up in one formula: AI Agent = LLM (Large Language Model) + Planning + Memory + Tools.
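To make the formula concrete, here is a minimal Python sketch of how the four parts might fit together. Everything in it is an illustrative assumption: `stub_llm` is a hypothetical stand-in for a real model call, the `price` tool returns a made-up quote, and the keyword-matching planner is a toy, not how any production agent plans.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call: replies to the last prompt line."""
    return "reply to: " + prompt.splitlines()[-1]


@dataclass
class Agent:
    llm: Callable[[str], str]                         # LLM
    tools: Dict[str, Callable[[str], str]]            # Tools
    memory: List[str] = field(default_factory=list)   # Memory

    def plan(self, goal: str) -> List[str]:
        # Planning (toy version): pick every tool whose name appears in the goal.
        return [name for name in self.tools if name in goal]

    def run(self, goal: str) -> str:
        # Call the planned tools and keep their results as notes.
        notes = [f"{name} -> {self.tools[name](goal)}" for name in self.plan(goal)]
        # The model sees past memory, fresh tool notes, and the goal itself.
        prompt = "\n".join(self.memory + notes + [goal])
        reply = self.llm(prompt)
        # Remember the exchange so later runs can build on it.
        self.memory.append(f"{goal} => {reply}")
        return reply


agent = Agent(llm=stub_llm, tools={"price": lambda goal: "42 (made-up quote)"})
first = agent.run("check the price of token X")
```

Swapping `stub_llm` for a real model client and the lambda for a real API wrapper would leave the structure unchanged, which is the point of the formula: the LLM is one component among four.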
A prompt-driven large model is more like a static person who only comes to life when we type something in, whereas the goal of an agent is to be a more real person. Agents in the crypto circle today are mostly models fine-tuned from Meta's open-source Llama 70B or 405B (the two differ in parameter count). They can remember and can call tools through APIs, but they may still need human help or input elsewhere, including for interaction and collaboration with other agents, which is why the main agents in the circle currently take the form of KOLs on social networks. To become more human-like, an agent needs planning and action capabilities, and within planning the chain of thought is especially critical.
III. Chain of Thought (CoT)
The concept of Chain of Thought (CoT) first appeared in the 2022 Google paper "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models", which showed that having a model generate a series of intermediate reasoning steps strengthens its reasoning ability and helps it understand and solve complex problems.
A typical CoT prompt has three parts: a clear task description, the logical basis or principles that support the solution, and a concrete worked demonstration. This structure helps the model understand what the task requires and approach the answer step by step through logical reasoning, which improves both the efficiency and the accuracy of problem solving. CoT is best suited to tasks that need in-depth analysis and multi-step reasoning, such as solving math problems or writing project reports. On simple tasks it may bring no obvious benefit, but on complex ones a step-by-step strategy can significantly improve the model's performance, lower its error rate, and raise the quality of the result.
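The three-part structure described above can be sketched as a small prompt builder. This is only an illustration: the section labels, the function name `build_cot_prompt`, the trailing answer cue, and the example texts are all assumptions made here, not a format taken from the paper.

```python
def build_cot_prompt(task: str, rationale: str, demonstration: str, question: str) -> str:
    """Assemble a CoT-style prompt from its three parts plus the new question."""
    sections = [
        ("Task", task),                        # 1. clear task description
        ("Reasoning principles", rationale),   # 2. logical basis for the solution
        ("Worked example", demonstration),     # 3. concrete step-by-step demonstration
        ("Question", question),                # the new problem (added for this sketch)
    ]
    body = "\n\n".join(f"{title}:\n{text.strip()}" for title, text in sections)
    # Cue the model to lay out intermediate steps before the final answer.
    return body + "\n\nAnswer (show each step):"


prompt = build_cot_prompt(
    task="Solve the arithmetic word problem.",
    rationale="Track each quantity separately, then combine them in order.",
    demonstration=(
        "A pool holds 5 tokens and receives 2 deposits of 3 tokens each. "
        "2 * 3 = 6, and 5 + 6 = 11, so the pool holds 11 tokens."
    ),
    question="A pool holds 8 tokens and receives 4 deposits of 2 tokens each. "
             "How many tokens does it hold?",
)
print(prompt)
```

The worked example matters most: it shows the model the shape of the intermediate steps it is expected to reproduce on the new question.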
CoT plays a key role in building AI Agents. An agent must understand the information it receives and make reasonable decisions based on it. CoT gives it an orderly way of thinking: it helps the agent process and analyze its input and turn the analysis into concrete action guidelines. Breaking a task into small steps lets the agent consider each decision point in detail, reducing the risk of wrong decisions caused by information overload. It also makes the decision process more transparent, so the agent's behavior is more predictable and traceable and users can more easily understand the basis for its decisions. In interacting with the environment, CoT allows the agent to keep absorbing new information and adjusting its behavioral strategy.
As a strategy, CoT both strengthens the reasoning of large language models and helps build more intelligent and reliable AI Agents. With it, researchers and developers can create systems that adapt better to complex environments and operate with a high degree of autonomy. Its advantages are clearest on complex tasks: decomposing a task into a series of small steps improves accuracy, makes the model more interpretable and controllable, and makes the whole solution easier to trace and verify.
The core function of CoT is to tie planning, action, and observation together, bridging the gap between reasoning and action. This mode of thinking lets an AI Agent prepare countermeasures for anomalies it anticipates and, while interacting with the external environment, accumulate new information and check its earlier predictions against it, which in turn provides a new basis for reasoning. CoT acts like an engine for accuracy and stability, helping AI Agents stay efficient in complex environments.
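The cycle of planning, acting, and observing can be sketched as a short loop. In the literature this interleaving is usually associated with the ReAct pattern rather than with CoT alone. The code below is a toy under stated assumptions: `toy_think` and `toy_act` are hypothetical stand-ins for a model's reasoning step and for an environment, and the "liquidity check" is invented for illustration.

```python
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Optional[str]]  # (thought, next action, or None when finished)


def run_agent_loop(
    goal: str,
    think: Callable[[str, List[str]], Step],
    act: Callable[[str], str],
    max_steps: int = 5,
) -> Tuple[str, List[str]]:
    """Alternate reasoning and acting until `think` stops requesting actions."""
    observations: List[str] = []
    for _ in range(max_steps):
        thought, action = think(goal, observations)  # plan from goal + evidence so far
        if action is None:
            return thought, observations             # reasoning has reached a conclusion
        observations.append(act(action))             # act, then record what was observed
    return "step limit reached", observations


def toy_think(goal: str, observations: List[str]) -> Step:
    # Hypothetical reasoning step: predict first, then verify against the observation.
    if not observations:
        return "I expect the pool to be liquid; check before trading.", "check_liquidity"
    if observations[-1] == "low":
        return "Prediction was wrong; skip the trade.", None
    return "Prediction confirmed; proceed.", None


def toy_act(action: str) -> str:
    # Hypothetical environment: a fixed lookup instead of a real on-chain query.
    return {"check_liquidity": "low"}.get(action, "unknown action")


conclusion, seen = run_agent_loop("trade token X", toy_think, toy_act)
print(conclusion)
```

The `max_steps` cap is a design choice worth keeping in any real version: an agent that never stops asking for actions should fail visibly rather than loop forever.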
IV. Pseudo-Demand That Looks Right
Which parts of the AI technology stack should crypto combine with? In last year's article, I argued that decentralizing computing power and data was the key step in helping small businesses and individual developers cut costs. In the Crypto x AI sector breakdown that Coinbase put out this year, we see a more detailed division:
(1) Computing layer (focused on providing GPU resources for AI developers);
(2) Data layer (supporting the decentralized access, orchestration and verification of AI data pipelines);
(3) Middleware layer (supporting the development, deployment and hosting of AI models or agents);
(4) Application layer (user-facing products that leverage on-chain AI mechanisms, whether B2B or B2C).
Each of these four layers carries a grand vision, and their goals can be summed up as resisting the Silicon Valley giants' monopoly over the next era of the Internet. As I asked last year, must we really accept that Silicon Valley giants monopolize computing power and data? The closed-source large models they control are black boxes. Science is the most widely believed religion of humanity today, so every sentence a future large model produces will be taken as truth by a large share of people, but how is that truth to be verified? Under the giants' plans, agents will eventually hold powers beyond imagination, such as the right to pay from your wallet or to operate your terminal. How can we be sure they will not act maliciously?
Decentralization is the only answer, but should we not also ask, reasonably, who will pay for these grand visions? In the past, tokens could paper over the gaps left by idealism without anyone considering whether the business model closed its loop. The current situation is far harsher. Crypto x AI has to be designed around real conditions. How can the compute layer balance supply and demand under performance loss and instability so that it can compete with centralized clouds? How many real users will data-layer projects have, how will the authenticity and validity of the data they provide be verified, and what kind of customers need that data? The same questions apply to the other two layers. In this era we do not need so many false demands that merely look correct.
V. MEME Broke Out of SocialFi
As I said in the first section, MEME has broken out, at extreme speed, in a SocialFi form that fits Web3. Friend.tech fired the first shot for this round of social applications, but unfortunately it failed because of its hasty token design. Pump.fun has verified the feasibility of pure platformization, with no token and no rules. The demanders and suppliers of attention are the same people: on the platform you can post meme images, livestream, issue coins, leave comments, and trade, everything is free, and Pump.fun only charges a service fee. This is essentially the attention-economy model of today's social media such as YouTube and Instagram. The only difference is who gets charged, and Pump.fun's gameplay is more Web3.
Base's Clanker is the culmination of this model. It benefits from an integrated ecosystem that Base operates itself: Base has its own social dApp as support, forming a complete internal closed loop. Agent MEME is the 2.0 form of the MEME coin. People always seek novelty, and Pump.fun happens to be at the forefront of the trend right now. Judging by that trend, it is only a matter of time before the absurd memes of silicon-based life forms replace the vulgar memes of carbon-based ones.
I have mentioned Base countless times, each time for a different reason. Looking at the timeline, Base has never been the first mover, yet it always ends up the winner.
VI. What Else Can Intelligent Agents Be?
From a pragmatic perspective, intelligent agents are unlikely to be decentralized for a long time to come. Judging by how the traditional AI field builds them, this is not a problem that decentralizing and open-sourcing the reasoning process can solve on its own: an agent has to call various APIs to reach Web2 content, it is expensive to run, and both the design of its chain of thought and the collaboration among multiple agents usually still depend on a human intermediary. We will go through a very long transition until a suitable fusion form appears, perhaps something like UNI. But as in the previous article, I still believe intelligent agents will have a huge impact on our industry, much as CEXs have: not "correct", but very important.
The article "A Survey of AI Agents", published by Stanford and Microsoft last month, describes applications of intelligent agents in medicine, intelligent machines, and virtual worlds, and its appendix already contains many experiments in which GPT-4V takes part as an agent in the development of top-tier AAA games.
We need not be too demanding about how quickly agents merge with decentralization. I would rather the first puzzle piece they fill in be bottom-up capability and speed. We have so many narrative ruins and empty metaverses waiting to be filled, and at the appropriate stage we can consider how to make agents the next UNI.