The Current State and Future of AI Agents

Author: jolestar Source: X, @jolestar

Last week I tinkered with an AI Agent, and the day before yesterday I attended an event hosted by ai16z in Beijing. I wanted to see what AI Agents can actually do today and to think about what they might be able to do in the future.

The current state of AI Agents reminds me of the meme of a person hidden inside a vending machine. Many people already imagine that AI Agents have autonomous consciousness, but in reality there is a developer hidden inside each one. (You will have to picture the scene yourself: I tried to get an AI to generate the image, but it could not grasp the idea of "hidden".)

The basic working mode of the AI Agent framework

An AI Agent framework currently acts as glue. It connects clients (Twitter, Discord, Telegram, etc.) with various plugins (different chains, etc.), and it provides a base library (memory storage, session isolation, context generation) for talking to the APIs of various AI platforms.
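To make the "glue" role concrete, here is a minimal sketch in Python. It is not the code of any real framework: every name in it (Message, AgentRuntime, stub_model, the "CALL" convention) is an assumption made for illustration, and the model is a stub rather than a real AI platform API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Every name below is hypothetical and chosen for illustration only.

@dataclass
class Message:
    client: str      # where it came from, e.g. "discord" or "telegram"
    session_id: str  # key used to keep conversations isolated
    user: str
    text: str

Model = Callable[[str], str]   # prompt in, completion out
Plugin = Callable[[str], str]  # argument in, result out (e.g. a chain query)

@dataclass
class AgentRuntime:
    model: Model
    plugins: Dict[str, Plugin] = field(default_factory=dict)
    memory: Dict[str, List[str]] = field(default_factory=dict)

    def build_context(self, msg: Message, max_turns: int = 6) -> str:
        """Context generation: recent lines of this session plus the new message."""
        history = self.memory.get(msg.session_id, [])[-max_turns:]
        return "\n".join(history + [f"{msg.user}: {msg.text}"])

    def handle(self, msg: Message) -> str:
        reply = self.model(self.build_context(msg))
        # Assumed convention: "CALL <plugin> <arg>" asks the framework to run a plugin.
        if reply.startswith("CALL "):
            name, _, arg = reply[len("CALL "):].partition(" ")
            plugin = self.plugins.get(name)
            reply = plugin(arg) if plugin else f"unknown plugin: {name}"
        # Memory storage: both sides of the exchange, under this session only.
        log = self.memory.setdefault(msg.session_id, [])
        log.extend([f"{msg.user}: {msg.text}", f"agent: {reply}"])
        return reply


def stub_model(prompt: str) -> str:
    """Stand-in for an AI platform API call so the sketch runs offline."""
    lines = prompt.splitlines()
    user, _, text = lines[-1].partition(": ")
    if "balance" in text.lower():
        return f"CALL balance {user}"
    return f"seen {len(lines)} line(s)"


runtime = AgentRuntime(
    model=stub_model,
    plugins={"balance": lambda who: f"{who} has 10 tokens"},
)
print(runtime.handle(Message("discord", "s1", "alice", "hello")))
print(runtime.handle(Message("discord", "s1", "alice", "what is my balance")))
```

The client adapters themselves are left out. In this sketch a client only has to turn a platform event into a Message and send the returned string back.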

How the AI Agent framework integrates with applications and business scenarios

Since the AI boom began last year, all kinds of platforms and tools have emerged, and the central problem they try to solve is how AI integrates with applications. Some AI platforms offer plugins, some build workflow models, and some traditional applications embed AI inside the application. The key questions are: 1. Where is the application's interaction entry point? 2. How can AI be integrated with existing business logic?

The entry point that most AI platforms offer users is some form of chat window, so there is clearly a consensus that interaction with AI applications should be "personified". The clever part of the AI Agent approach is that it connects directly to the open IM and social systems people already use, which is clearly easier for users to accept than yet another new interface.

How can AI be integrated with existing business logic? The AI Agent answer is to let developers embed AI decision-making into business scenarios. Programming languages require certainty: the condition of an if statement can only be true or false, so code alone cannot handle fuzzy business logic. With AI, fuzzy and complex conditions can be converted into definite values and then integrated seamlessly into the business flow.

Take replying to messages in a group as an example. A traditional IM bot has to be triggered by explicit commands, but with AI you can implement a method such as shouldReplyMessage: give it the context and it returns true or false.
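A minimal sketch of such a method follows, written in Python as should_reply_message. The prompt wording, the YES/NO convention, and fake_model (a keyword stub standing in for a real model call) are all assumptions made for illustration.

```python
from typing import Callable, List

Model = Callable[[str], str]  # prompt in, completion out

# Hypothetical prompt template; a real one would need tuning.
PROMPT = (
    "You are {name}, a bot in a group chat.\n"
    "Earlier messages:\n{earlier}\n"
    "Latest message: {latest}\n"
    "Should {name} reply to the latest message? Answer YES or NO."
)

def should_reply_message(model: Model, name: str, history: List[str]) -> bool:
    """Turn a fuzzy judgement ("is this worth answering?") into a strict bool."""
    if not history:
        return False
    prompt = PROMPT.format(
        name=name,
        earlier="\n".join(history[:-1]) or "(none)",
        latest=history[-1],
    )
    answer = model(prompt).strip().upper()
    # Anything other than a clear YES is treated as "stay silent".
    return answer.startswith("YES")


def fake_model(prompt: str) -> str:
    """Keyword stub so the sketch runs offline: it answers yes to questions."""
    latest = next(
        line for line in prompt.splitlines() if line.startswith("Latest message: ")
    )
    return "yes" if "?" in latest else "No, this is small talk."


if should_reply_message(fake_model, "helper", ["alice: gm", "bob: how do I bridge tokens?"]):
    print("the bot replies")
```

The calling code stays an ordinary if statement. Only the evaluation of the condition is delegated to the model, and unclear model output falls back to the safe default of not replying.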

AI plays two main roles in business logic:

1. "Intent" discovery: instructions in the prompt let the AI discover the "intent" in a user's text message from the context and map that intent to specific code.

2. Decision support: AI converts fuzzy, complex conditions into definite true/false values or enumeration types, which are then integrated into the business logic.

At this point many people may feel disappointed, because they expected that an AI Agent could be taught a little and then do anything. In fact, because of the context limits of large models, it is not possible (at least for now) to build an omnipotent AI. The good news is that programmers need not worry about losing their jobs: there are still many programmers behind the AI, and people still have to stack if-else statements. The key difference is that the range of business problems programs can handle is expanding.
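Both roles can be sketched together: the model's free-form output is forced into an enumeration, and the enumeration selects ordinary code. The Intent values, the handlers, and keyword_model (a stub in place of a real model) are hypothetical names used only for this illustration.

```python
from enum import Enum
from typing import Callable, Dict

class Intent(Enum):
    CHECK_BALANCE = "check_balance"
    TRANSFER = "transfer"
    UNKNOWN = "unknown"

Model = Callable[[str], str]  # prompt in, completion out

def discover_intent(model: Model, text: str) -> Intent:
    """Ask the model for one label and coerce the answer into the enum."""
    options = ", ".join(i.value for i in Intent)
    prompt = f"Classify the user's intent as one of: {options}.\nUser: {text}\nIntent:"
    raw = model(prompt).strip().lower()
    try:
        return Intent(raw)
    except ValueError:
        # Free-form or unexpected output never reaches the business logic.
        return Intent.UNKNOWN

# Plain deterministic code, keyed by intent.
HANDLERS: Dict[Intent, Callable[[str], str]] = {
    Intent.CHECK_BALANCE: lambda text: "would look up the balance",
    Intent.TRANSFER: lambda text: "would start a transfer",
    Intent.UNKNOWN: lambda text: "would ask the user to rephrase",
}

def dispatch(model: Model, text: str) -> str:
    return HANDLERS[discover_intent(model, text)](text)


def keyword_model(prompt: str) -> str:
    """Keyword stub standing in for a model call so the sketch runs offline."""
    user_line = next(
        line for line in prompt.splitlines() if line.startswith("User: ")
    ).lower()
    if "send" in user_line:
        return "transfer"
    if "how much" in user_line:
        return "check_balance"
    return "I am not sure"


print(dispatch(keyword_model, "Please send 5 tokens to bob"))
```

The fallback to Intent.UNKNOWN is the important design choice here: the if-else side of the program only ever sees values it was written to handle.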

Two types of AI Agents

At the event I asked @shawmakesmagic a question. The market has two expectations for AI Agents: 1. The AI Agent plays a role of its own, with its own ID and brand, and provides services to users. 2. Each user has a personal AI Agent that acts like a personal assistant and handles tasks on the user's behalf. Which type will be more popular? He felt that both directions look promising and that they may even converge.

The market is mainly exploring the first direction today. It amounts to turning services into AI Agents: in the future there may be no app interfaces at all, and every app will be Agent-ized and personified. The second direction is the Agent-ization of application clients: the application client becomes a plugin of the assistant Agent, the application's local data becomes part of the Agent's memory, and the plugin is also responsible for communicating with the cloud-side service Agent. This is a new application architecture that would change the entire infrastructure.

The infrastructure requirements for the AI Agent

1. The infrastructure needs to be permissionless; otherwise AI Agents will be blocked by all kinds of anti-abuse measures. Services should instead defend against attacks through economic cost (Gas). Less open platforms will be hit harder, and the open-platform craze of the early Web2 era will be rekindled.

2. AI Agents need to be able to operate funds so that they can pay these costs.

In other words, future services, whether or not they are based on blockchain, will need to support crypto-based identity authentication and crypto-based payments.
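As a rough illustration of the two requirements, the sketch below shows a service that admits any caller but charges a fee per request, and an agent wallet that pays it. Wallet, MeteredService, and the integer fee are hypothetical stand-ins. No real chain, token, or signature scheme is modelled.

```python
from dataclasses import dataclass

class InsufficientFunds(Exception):
    """Raised when a wallet cannot cover a fee."""

@dataclass
class Wallet:
    owner: str    # in a real system this would be a cryptographic identity
    balance: int  # abstract fee units

    def pay(self, amount: int) -> None:
        if amount > self.balance:
            raise InsufficientFunds(f"{self.owner} cannot pay {amount}")
        self.balance -= amount

@dataclass
class MeteredService:
    """Permissionless: no allow-list, no CAPTCHA, only a price per call."""
    fee: int
    revenue: int = 0

    def call(self, wallet: Wallet, request: str) -> str:
        wallet.pay(self.fee)  # raises before any work is done
        self.revenue += self.fee
        return f"handled: {request}"


agent_wallet = Wallet(owner="agent-1", balance=5)
service = MeteredService(fee=2)
print(service.call(agent_wallet, "summarize thread"))
print(agent_wallet.balance)
```

Flooding such a service is possible but costs the attacker money on every request, which is the economic defence the first point describes.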

The integration of AI Agents and blockchains

Beyond the two points above, how AI Agents integrate with blockchains is a direction everyone is exploring. At the event I chatted with @Mikkke_acc about focEliza, which they are working on. The first type of AI Agent mentioned earlier needs, at a minimum, a runtime or verification environment provided by a blockchain: once an AI Agent provides services to the outside world, trust becomes an issue, and the role it plays is essentially the same as that of a smart contract.

The name "smart contract" was controversial from the start: it is just a piece of code, so where is the "intelligence"? AI can make smart contracts truly smart. The difficulty is how to call AI interfaces from within a smart contract environment. Running large models in a verifiable environment is still a fairly distant prospect, so an oracle-style solution is the more feasible path for now.

Many derived needs will also emerge around AI Agents: how an AI Agent obtains public knowledge, how it judges facts, how it identifies the same user across platforms, how "memory" is stored in a smart contract, and how multiple devices running my AI Agent share memory.

You will find that much of what Web3 has already built, such as "data on-chain", "relationships on-chain", "DID", and "P2P networks", takes on new meaning and new scenarios.
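The oracle-style path can be sketched as a request and callback loop. The simulation below is plain Python, not contract code: Contract stands in for deterministic on-chain logic that cannot call an AI API itself, and run_oracle stands in for an off-chain worker that can. All names are assumptions, and the sketch deliberately ignores how the oracle's answer would be verified.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

Model = Callable[[str], str]  # prompt in, completion out

@dataclass
class Contract:
    """Stand-in for on-chain code: it can only record requests and results."""
    oracle: str  # identity allowed to post answers
    pending: Dict[int, str] = field(default_factory=dict)
    results: Dict[int, bool] = field(default_factory=dict)
    next_id: int = 0

    def request_decision(self, question: str) -> int:
        """On-chain step 1: record a question for the oracle to pick up."""
        request_id = self.next_id
        self.next_id += 1
        self.pending[request_id] = question
        return request_id

    def fulfill(self, sender: str, request_id: int, answer: bool) -> None:
        """On-chain step 2: accept the answer, but only from the registered oracle."""
        if sender != self.oracle:
            raise PermissionError("only the registered oracle may answer")
        if request_id not in self.pending:
            raise KeyError(f"no pending request {request_id}")
        del self.pending[request_id]
        self.results[request_id] = answer

def run_oracle(contract: Contract, oracle_id: str, model: Model) -> int:
    """Off-chain worker: ask the model, then write a strict bool back."""
    handled = 0
    for request_id, question in list(contract.pending.items()):
        answer = model(question).strip().upper().startswith("YES")
        contract.fulfill(oracle_id, request_id, answer)
        handled += 1
    return handled


contract = Contract(oracle="oracle-1")
rid = contract.request_decision("Is this refund claim valid? Evidence: receipt attached")
run_oracle(contract, "oracle-1", lambda q: "yes" if "receipt" in q else "no")
print(contract.results[rid])
```

The contract still trusts whoever runs the oracle, which is exactly the trust question that running the model in a verifiable environment would eventually remove.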

Conclusion

To reuse the conclusion from my 2021 talk on AI and blockchain: a more AI-friendly Internet is also a more human-friendly Internet. At the time it was just a wild idea, but now the future has arrived.
