Author: jolestar
Last week I tinkered with an AI Agent, and the day before yesterday I attended an event hosted by ai16z in Beijing. I wanted to see what AI Agents can actually do today and to think about what they might be able to do in the future.
The current state of AI Agents reminds me of the meme in which a person is hidden inside a vending machine. Everyone already imagines that AI Agents have developed autonomous consciousness, but in reality there is a developer hidden inside the AI Agent. (You will have to picture the scene yourself: I tried to get an AI to generate this image, but found that AI cannot understand "hidden".)
The Basic Working Mode of the AI Agent Framework
The AI Agent framework currently acts as glue: it connects clients (Twitter, Discord, Telegram, etc.) to various plugins (different chains, etc.), and it provides a base library (memory storage, session isolation, context generation) for interfacing with the APIs of various AI platforms.
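To make the "glue" role concrete, here is a minimal sketch in Python. Every name in it (AgentRuntime, handle, fake_model, the "balance" plugin) is my own assumption for illustration and is not the API of any real framework; the model is a stub so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types, not taken from any real framework.
ModelFn = Callable[[str], str]   # prompt -> raw completion
Plugin = Callable[[str], str]    # action argument -> result text


@dataclass
class AgentRuntime:
    model: ModelFn
    plugins: Dict[str, Plugin] = field(default_factory=dict)
    # Session isolation: one message history per (client, room) pair.
    memory: Dict[str, List[str]] = field(default_factory=dict)

    def handle(self, client: str, room: str, text: str) -> str:
        session = f"{client}:{room}"
        history = self.memory.setdefault(session, [])
        history.append(f"user: {text}")
        # Context generation: recent history plus the available actions.
        prompt = "\n".join(history[-10:]) + "\nactions: " + ",".join(sorted(self.plugins))
        decision = self.model(prompt)
        # Assumed convention: the model answers "<action> <argument>" or plain text.
        name, _, arg = decision.partition(" ")
        reply = self.plugins[name](arg) if name in self.plugins else decision
        history.append(f"agent: {reply}")
        return reply


def fake_model(prompt: str) -> str:
    # Stand-in for a real AI platform API call.
    last_user = [line for line in prompt.splitlines() if line.startswith("user: ")][-1]
    return "balance alice" if "balance" in last_user else "Hello!"


runtime = AgentRuntime(
    model=fake_model,
    plugins={"balance": lambda who: f"{who} has 42 tokens"},  # made-up chain plugin
)
```

A client adapter for Telegram or Discord would only need to call runtime.handle(client, room, text) and send the returned string back to the chat.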
How the AI Agent Framework Integrates with Applications and Business Scenarios
Since the AI boom last year, various platforms and tools have emerged, all trying to solve the same key problem: how AI integrates with applications. Some AI platforms have offered plugins, some have built workflow models, and some traditional applications have embedded AI inside the application. The key questions are: 1. Where is the application's interaction entry point? 2. How can AI be integrated with existing business logic?
The interaction entry point that AI platforms give users is a dialog box resembling a chat window; evidently everyone feels that interaction with AI applications should be "anthropomorphic". The clever part of the AI Agent is that it connects directly to every open IM and social system, which is obviously easier for users to accept than creating a new entry point.
How can AI be integrated with existing business logic? The solution the AI Agent offers is to let developers embed AI decision-making into business scenarios. Programming languages require certainty: the condition of an if statement can only be true or false, so it cannot express fuzzy business logic. With AI, fuzzy and complex logic can be converted into precise conditions and then integrated seamlessly into business scenarios.
Take replying to messages in a group as an example. A traditional IM bot has to be triggered by explicit message commands, but with AI one can implement a method "shouldReplyMessage": give it the context, and it returns true or false.
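A minimal sketch of such a method in Python, under stated assumptions: the prompt wording, the ModelFn type, and keyword_model (a stub standing in for a real model call) are all hypothetical and only illustrate how a fuzzy judgment becomes a boolean.

```python
from typing import Callable, List

ModelFn = Callable[[str], str]  # hypothetical: prompt in, raw completion out

PROMPT = """You are a member of a group chat named {agent}.
Recent messages:
{history}
Should {agent} reply to the last message? Answer YES or NO."""


def should_reply_message(model: ModelFn, agent: str, history: List[str]) -> bool:
    prompt = PROMPT.format(agent=agent, history="\n".join(history))
    answer = model(prompt).strip().upper()
    # Anything that is not a clear YES counts as False, so an
    # unparseable answer never triggers a reply.
    return answer.startswith("YES")


def keyword_model(prompt: str) -> str:
    # Stub used only so the sketch runs offline; a real model would
    # judge the context instead of matching a keyword.
    return "Yes." if "@bot" in prompt else "no"
```

The calling code then stays an ordinary if statement, for example `if should_reply_message(model, "bot", history): ...`; only the condition is delegated to the model.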
AI mainly plays two roles in business logic:
1. "Intent" discovery: use instructions in the prompt to have the AI discover the "intent" in the user's text message from the context, then map that intent to specific code.
2. Decision support: use AI to convert fuzzy, complex conditions into definite true/false values or enumeration types, then integrate them into the business logic.
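The two roles above can be sketched together: the model's free-text answer is parsed into an enumeration, and the enumeration is mapped to ordinary code. This is a sketch under my own assumptions; the Intent values, the handlers, and stub_model are invented for illustration.

```python
from enum import Enum
from typing import Callable, Dict

ModelFn = Callable[[str], str]  # hypothetical: prompt in, raw completion out


class Intent(Enum):
    TRANSFER = "transfer"
    CHECK_BALANCE = "check_balance"
    NONE = "none"


def discover_intent(model: ModelFn, message: str) -> Intent:
    options = ", ".join(intent.value for intent in Intent)
    raw = model(f"Classify the user's intent as one of: {options}.\nMessage: {message}\nIntent:")
    try:
        return Intent(raw.strip().lower())
    except ValueError:
        # The model answered outside the enumeration: fall back safely.
        return Intent.NONE


# Mapping from intent to specific code; the handlers are placeholders.
HANDLERS: Dict[Intent, Callable[[str], str]] = {
    Intent.TRANSFER: lambda msg: "transfer flow started",
    Intent.CHECK_BALANCE: lambda msg: "balance lookup started",
    Intent.NONE: lambda msg: "no action",
}


def dispatch(model: ModelFn, message: str) -> str:
    return HANDLERS[discover_intent(model, message)](message)


def stub_model(prompt: str) -> str:
    # Stand-in for a real model call, so the sketch runs offline.
    message = prompt.split("Message: ")[1].lower()
    if "send" in message:
        return " Transfer\n"
    if "balance" in message:
        return "check_balance"
    return "I am not sure"
```

The point of the enumeration is that everything after discover_intent is plain deterministic code; the model's output is never executed directly.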
At this point, many people may be disappointed with the AI Agent, having assumed that it only needs a little teaching before it can do anything. In fact, because of the context limits of large models, it is not possible (at least for now) to create an omnipotent AI. The good news is that programmers don't have to worry about losing their jobs: there are still plenty of programmers behind the AI, and people still need to stack if-else. The key difference is that the business boundaries that programs can handle are expanding.
Two Types of AI Agents
At the event, I asked shaw a question. The market has two expectations for AI Agents: 1. The AI Agent plays a role itself, with its own ID and brand, and provides services to users. 2. Each user has a personal AI Agent, like a personal assistant, that helps handle some of the user's business. Which type of AI Agent will be more popular? He felt that both directions are promising and that they may even be combined.
Currently, the market is mainly exploring the first direction, which amounts to the AI Agent-ization of services: in the future there may be no App interfaces at all, and everything will be AI Agent-ized and anthropomorphized. The second direction is the Agent-ization of application clients: the future application client will be a plugin of the assistant Agent, the application's local data will become part of the Agent's memory store, and this plugin will also be responsible for communicating with the cloud service's Agent. This is a new application architecture model that will change the entire infrastructure.
Infrastructure Requirements for AI Agents
1. The infrastructure needs to be permissionless; otherwise AI Agents will be blocked by various anti-attack strategies. Services should instead use economic costs (Gas) to defend against attacks. Platforms that are less open will be hit harder, and the open-platform fever of the early Web2 era will be reignited.
2. AI Agents need to be able to control funds, so that they can pay the economic costs described above.
In other words, future services, whether or not they are based on blockchain, need to support crypto-based identity authentication and crypto-based payments.
The Integration of AI Agents and Blockchains
Beyond the two points above, everyone is exploring how AI Agents integrate with blockchains. At the event, I chatted with Mikkke about focEliza, which he is working on. Of the two types of AI Agents mentioned earlier, at least the first needs an execution or verification environment provided by a blockchain: once an AI Agent provides services to the outside world, trust issues arise, and the role it plays is effectively the same as that of a smart contract.
There was controversy over the name "smart contract" back then: it is just a piece of code, so where is the "smart"? AI can make smart contracts truly smart. The difficulty is how to call AI interfaces from within a smart contract environment. Running large models in a verifiable environment is still a fairly distant prospect, so an oracle-like solution is the more feasible path.
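To show roughly what an oracle-like path could look like, here is a toy simulation in plain Python rather than real contract code. All names (Contract, request_decision, fulfill, run_oracle, the "0xoracle" address) are my own assumptions: the contract only records a request, an off-chain oracle calls the model, and the contract accepts the answer only from the registered oracle address. The sketch says nothing about how to trust the oracle itself, which is the hard part.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

ModelFn = Callable[[str], str]  # hypothetical: question in, raw answer out


@dataclass
class Contract:
    """Toy stand-in for on-chain state; not real smart contract code."""
    oracle_address: str
    pending: Dict[int, str] = field(default_factory=dict)
    results: Dict[int, bool] = field(default_factory=dict)
    next_id: int = 0

    def request_decision(self, question: str) -> int:
        # On-chain step: record the request; no AI call happens here.
        request_id = self.next_id
        self.next_id += 1
        self.pending[request_id] = question
        return request_id

    def fulfill(self, sender: str, request_id: int, answer: bool) -> None:
        # On-chain step: accept an answer only from the registered oracle.
        if sender != self.oracle_address:
            raise PermissionError("only the registered oracle may fulfill")
        if request_id not in self.pending:
            raise KeyError("unknown or already fulfilled request")
        del self.pending[request_id]
        self.results[request_id] = answer


def run_oracle(contract: Contract, address: str, model: ModelFn) -> None:
    # Off-chain step: read pending requests, ask the model, post answers back.
    for request_id, question in list(contract.pending.items()):
        answer = model(question).strip().upper().startswith("YES")
        contract.fulfill(address, request_id, answer)
```

In this shape the contract logic stays deterministic: it consumes a boolean that arrives later, instead of calling the model itself.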
Many derived demands will also arise around the AI Agent: how the AI Agent obtains public knowledge, how it judges facts, how it identifies the same user across different platforms, how the "memory" in a smart contract is stored, and, if I have AI Agents on multiple devices, how they share memory.
You will find that "data on-chain", "relationships on-chain", "DID", "P2P networks" and the other things Web3 has been building all take on new meanings and scenarios.
Conclusion
To reuse the conclusion of my 2021 talk on AI and blockchain: a more AI-friendly Internet is also a more human-friendly Internet. At the time it was just a wild idea, but now the future has arrived.