Author: jolestar
Last week I tinkered with an AI Agent, and the day before yesterday I attended an event hosted by ai16z in Beijing. I wanted to see what AI Agents can actually do today and to think about what they might be able to do in the future.
The current state of AI Agents reminds me of the meme of a person hidden inside a vending machine. Everyone already imagines that AI Agents have begun to develop autonomous consciousness, but in reality there is a developer hidden inside each AI Agent. (You will have to picture the scene yourself: I tried to have an AI generate the image, but it turned out that AI cannot understand "hidden".)
The basic working mode of the AI Agent framework
The AI Agent framework currently acts as glue. It connects clients (Twitter, Discord, Telegram, etc.) with various plugins (for different chains, etc.), provides a base library (memory storage, session isolation, context generation), and plugs into the interfaces of various AI platforms.
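To make the "glue" role concrete, here is a minimal sketch under stated assumptions. It is not the API of any real framework: `AgentFramework`, `Message`, `ModelFn`, the `plugin:<name>` reply convention, and `fake_model` are all hypothetical names invented for illustration. It only shows the pieces listed above: per-session memory, context generation, a model call, and plugin dispatch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for an AI platform interface: prompt in, text out.
ModelFn = Callable[[str], str]


@dataclass
class Message:
    client: str  # e.g. "twitter", "discord", "telegram"
    room: str    # conversation identifier within that client
    user: str
    text: str


@dataclass
class AgentFramework:
    model: ModelFn
    # Plugins are named actions the agent can trigger (e.g. a chain query).
    plugins: Dict[str, Callable[[Message], str]] = field(default_factory=dict)
    # Memory is keyed by (client, room) so that sessions stay isolated.
    memory: Dict[Tuple[str, str], List[str]] = field(default_factory=dict)

    def build_context(self, msg: Message, window: int = 5) -> str:
        """Context generation: recent history of this session plus the new message."""
        history = self.memory.get((msg.client, msg.room), [])
        return "\n".join(history[-window:] + [f"{msg.user}: {msg.text}"])

    def handle(self, msg: Message) -> str:
        reply = self.model(self.build_context(msg))
        # Assumed convention: a reply "plugin:<name>" hands off to that plugin.
        if reply.startswith("plugin:"):
            name = reply.split(":", 1)[1].strip()
            if name in self.plugins:
                reply = self.plugins[name](msg)
        log = self.memory.setdefault((msg.client, msg.room), [])
        log.append(f"{msg.user}: {msg.text}")
        log.append(f"agent: {reply}")
        return reply


def fake_model(prompt: str) -> str:
    """Stub model for the sketch: routes balance questions to a plugin."""
    last_line = prompt.splitlines()[-1]
    return "plugin:balance" if "balance" in last_line else "hello"
```

A real framework would replace `fake_model` with a call to an AI platform and add many more concerns (rate limits, persistence, character configuration); the point here is only how thin the connecting layer is.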
How the AI Agent framework integrates with applications and business scenarios
Since the AI boom began last year, many platforms and tools have emerged, all trying to solve one key problem: how to integrate AI with applications. Some AI platforms offer plugins, some have built workflow models, and some traditional applications have embedded AI directly. The key questions are: 1. Where is the application's interaction entry point? 2. How can AI be integrated with existing business logic?
The interaction entry point that every AI platform gives users is a chat-style dialogue box; evidently everyone believes that interaction with AI applications should feel human. The clever part of the AI Agent is that it plugs directly into all the open IM and social systems, which users accept far more readily than yet another new interface.
How can AI be integrated with existing business logic? The AI Agent's answer is to let developers build AI decision-making into business scenarios. Programming languages demand certainty: the condition of an if statement can only be true or false, so it cannot express fuzzy business logic. With AI, a fuzzy, complex judgment can be converted into a precise condition and then integrated seamlessly into the business scenario.
Take replying to messages in a group as an example. A traditional IM bot has to be triggered by explicit commands in a message, but with AI you can implement a method shouldReplyMessage: give it the context, and it returns true or false.
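A minimal sketch of such a method might look like the following. It assumes a hypothetical `ModelFn` callable that takes a prompt and returns raw text; the prompt wording and the YES/NO convention are illustrative choices, not taken from any particular framework. The function is spelled `should_reply_message` to follow Python naming.

```python
from typing import Callable, List

# Hypothetical stand-in for an AI platform call: prompt in, raw text out.
ModelFn = Callable[[str], str]

PROMPT = """You are {agent_name}, a member of a group chat.
Recent messages:
{history}

Should {agent_name} reply to the last message? Answer YES or NO."""


def should_reply_message(model: ModelFn, agent_name: str, history: List[str]) -> bool:
    """Turn a fuzzy judgment ("is this message meant for me?") into a boolean."""
    prompt = PROMPT.format(agent_name=agent_name, history="\n".join(history))
    answer = model(prompt).strip().upper()
    # Anything other than a clear YES is treated as "do not reply".
    return answer.startswith("YES")
```

The caller can then write an ordinary `if should_reply_message(...)` branch. Defaulting to `False` on an unclear answer is a deliberate choice here: a bot that stays silent by mistake is usually less harmful than one that replies to everything.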
The role of AI in business logic scenarios is mainly:
1. "Intent" discovery: following the instructions in the prompt, the AI discovers the "intent" in the user's text message based on the context, and that intent is mapped to specific code.
2. Decision support: the AI converts fuzzy, complex conditions into definite true/false or enumeration values, which are then integrated into the business logic.
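The two roles above can be sketched together: the model is asked to pick one label from a fixed enumeration, and ordinary code dispatches on the result. The `Intent` values, the handlers, and `ModelFn` are hypothetical and chosen only for illustration.

```python
from enum import Enum
from typing import Callable, Dict

# Hypothetical stand-in for an AI platform call: prompt in, raw text out.
ModelFn = Callable[[str], str]


class Intent(Enum):
    TRANSFER = "transfer"
    CHECK_BALANCE = "check_balance"
    CHAT = "chat"


def discover_intent(model: ModelFn, text: str) -> Intent:
    """Ask the model for exactly one label and map it onto the enum."""
    options = ", ".join(i.value for i in Intent)
    prompt = (
        f"Classify the user's message into exactly one of: {options}.\n"
        f"Message: {text}\n"
        "Answer with the label only."
    )
    label = model(prompt).strip().lower()
    try:
        return Intent(label)
    except ValueError:
        # Unrecognized output falls back to plain conversation.
        return Intent.CHAT


# Each intent maps to ordinary, deterministic code.
HANDLERS: Dict[Intent, Callable[[str], str]] = {
    Intent.TRANSFER: lambda text: "transfer flow started",
    Intent.CHECK_BALANCE: lambda text: "balance lookup requested",
    Intent.CHAT: lambda text: "small talk",
}


def dispatch(model: ModelFn, text: str) -> str:
    return HANDLERS[discover_intent(model, text)](text)
```

The model only narrows free text down to a closed set of values; everything after that point is the same if-else style code programmers already write.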
At this point many people may be disappointed with AI Agents, having assumed that an AI Agent only needs a little teaching before it can do anything. In fact, because of the context limits of large models, it is not possible (at least for now) to build an omnipotent AI that can do everything. The good news is that programmers need not worry about losing their jobs: there are still plenty of programmers behind the AI, and people still have to stack up if-else statements. The key difference is that the range of business problems programs can handle is expanding.
Two types of AI Agents
At the event I asked Shaw a question. The market has two expectations for AI Agents: 1. The AI Agent plays a role of its own, with its own ID and brand, and provides services to users. 2. Each user has a personal AI Agent that acts like a personal assistant and handles tasks on the user's behalf. Which type will be more popular? He felt that both directions are promising and that they may even merge.
The market is currently exploring mainly the first direction, which amounts to turning services into AI Agents; in the future there may be no app interfaces at all, with everything delivered through human-like AI Agents. The second direction turns application clients into agents: the application client becomes a plugin of the assistant Agent, the application's local data becomes part of the Agent's memory, and the plugin is also responsible for communicating with the cloud-side service Agent. This is a new application architecture that will change the entire infrastructure.
Infrastructure requirements for AI Agents
1. The infrastructure needs to be permissionless (zero barrier to entry); otherwise AI Agents will be blocked by all kinds of anti-abuse measures. Services should instead deter attacks by economic means (Gas). Platforms that are relatively closed will take a heavy hit, and the open-platform enthusiasm of early Web2 will be rekindled.
2. AI Agents need to be able to control funds so that they can pay for this.
In other words, future services, whether or not they are built on a blockchain, will need to support crypto-based private key authentication and crypto-based payments.
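As a rough illustration of deterring abuse by economic means rather than by gatekeeping, the toy service below lets any account call it, as long as the account has prepaid enough to cover a per-call fee. `MeteredService`, its fields, and the fee unit are hypothetical, and the sketch deliberately omits private key authentication: a real service would verify a signature before trusting the `account` argument.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MeteredService:
    price_per_call: int  # hypothetical unit, e.g. the smallest token denomination
    balances: Dict[str, int] = field(default_factory=dict)

    def deposit(self, account: str, amount: int) -> None:
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balances[account] = self.balances.get(account, 0) + amount

    def call(self, account: str, payload: str) -> str:
        """No allowlist and no CAPTCHA: the only gate is the ability to pay."""
        balance = self.balances.get(account, 0)
        if balance < self.price_per_call:
            raise PermissionError("insufficient balance for this call")
        self.balances[account] = balance - self.price_per_call
        return f"processed: {payload}"
```

Flooding such a service costs the attacker in proportion to the load they create, which is the property Gas provides on a chain.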
The integration of AI Agents and blockchains
Beyond the two points above, how AI Agents integrate with blockchains is something everyone is exploring. At the event I chatted with Mikkke about focEliza, which he is working on. The first type of AI Agent mentioned earlier needs, at a minimum, the execution or verification environment that a blockchain provides, because once an AI Agent offers services to the outside world, trust issues arise, and the role it plays is effectively the same as that of a smart contract.
The name "smart contract" used to be controversial: it is just a piece of code, so what is "smart" about it? AI can make smart contracts truly live up to their name. The difficulty is how to call AI interfaces from within a smart contract environment. If running large models in a verifiable environment is still a fairly distant prospect, an oracle-like solution is the more viable path.
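The oracle-style approach can be sketched as a request and callback pair. The code below is a plain Python simulation, not contract code and not the design of any specific project: `ToyContract`, `run_oracle_once`, the address strings, and `ModelFn` are hypothetical. The contract records a request, an off-chain oracle calls the model, and the contract accepts the answer only from the registered oracle address.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical stand-in for the off-chain AI call: prompt in, text out.
ModelFn = Callable[[str], str]


@dataclass
class ToyContract:
    oracle_address: str
    next_id: int = 0
    pending: Dict[int, str] = field(default_factory=dict)
    results: Dict[int, str] = field(default_factory=dict)

    def request(self, prompt: str) -> int:
        """On-chain side: record the question and return its request id."""
        request_id = self.next_id
        self.next_id += 1
        self.pending[request_id] = prompt
        return request_id

    def fulfill(self, sender: str, request_id: int, answer: str) -> None:
        """Callback: only the registered oracle may post an answer."""
        if sender != self.oracle_address:
            raise PermissionError("only the registered oracle may fulfill")
        if request_id not in self.pending:
            raise KeyError(f"unknown or already fulfilled request {request_id}")
        del self.pending[request_id]
        self.results[request_id] = answer


def run_oracle_once(contract: ToyContract, oracle_address: str, model: ModelFn) -> int:
    """Off-chain side: answer every pending request and report how many."""
    fulfilled = 0
    for request_id, prompt in list(contract.pending.items()):
        contract.fulfill(oracle_address, request_id, model(prompt))
        fulfilled += 1
    return fulfilled
```

Note that this moves the trust question to the oracle rather than removing it: the contract can check who answered, but not whether the model was actually run as claimed.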
Many derived needs will also emerge around AI Agents: how an AI Agent obtains public knowledge, how it judges facts, how it identifies the same user across different platforms, how "memory" is stored in a smart contract, and how AI Agents installed on several of my devices share memory.
You will find that "data on-chain", "relationship on-chain", "DID", "P2P network", and the other things Web3 has been working on all take on new meaning and new use cases.
Conclusion
To reuse the conclusion of a talk I gave on AI and blockchain in 2021: a more AI-friendly Internet is also a more human-friendly Internet. Back then it was just a wild idea, but now the future has arrived.