Manus ignites AGI's virtual fire. Can DeFAI follow suit?


P Spring Dream V Half-life P

Manus is not as impressive as DeepSeek V3/R1; it is more a fusion of the technical hype around MCP and Operator.

After DeepSeek's "5+1" days of open-source releases, Manus has picked up the flag on the world's road to AGI. Or has it?

Looking closely at the product details, Manus seems to have arrived on the wrong date. October 22 of last year would have been the fitting release date: that was the day Anthropic released computer use for Claude, in other words the day the LLM jumped out of the chatbot and became an Agent wandering and exploring cyberspace. OpenAI's Operator, for its part, was not born until January 2025.

There are quite a few concepts here. Let's unpack them step by step, in CoT (Chain of Thought) fashion, and get a glimpse of what Manus is really about.

AI Awakening: The Path to Freedom

The path beyond the dialog box is paved by authorization.

OpenAI's greatness does not lie in GPT itself; the Transformer paradigm was invented by Google. Its real innovation was making chat the first entry point for human-computer interaction. We can think of it as an intelligent database that can answer almost any question, but the emphasis is on "answering your questions" rather than "solving the problem for you". For example, you can ask ChatGPT how to treat a cold and it can list answers for different situations, but it cannot make a specific diagnosis or order medicine.

In this sense, the value of DeepSeek lies in making the model smarter (DeepSeek V3) and strengthening its diagnostic capability (DeepSeek R1), so that it can tell whether a cold is viral or was brought on by a drop in temperature.

But AI still cannot buy the medicine for you. GPT's full potential remains sealed inside the dialog box, and we hope to release it.

Computer Use was born in response. In path design and outward form it resembles the simplest keyboard-and-mouse macro tools, Apple Shortcuts, and AppleScript: all of them replace manual keyboard and mouse operations (or screen taps). The inside is different, though. You don't need to write custom script rules; you just tell Claude in conversation to perform the corresponding operations.
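To make the contrast concrete, here is a minimal sketch, not Anthropic's actual implementation. It assumes a hypothetical `stub_planner` standing in for the model and a made-up `Action` type; a real agent would also look at screenshots and actually drive the mouse and keyboard. The point is only that the macro is written out step by step in advance, while the agent derives its steps from a plain-language instruction.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    kind: str    # hypothetical action kinds: "open", "type", "click"
    target: str


# Scripted macro: the user has to spell out every step in advance.
MACRO: List[Action] = [
    Action("open", "browser"),
    Action("type", "cold medicine"),
    Action("click", "search"),
]


def stub_planner(instruction: str, history: List[Action]) -> Optional[Action]:
    """Stand-in for the model (an assumption): picks the next action from the instruction."""
    query = instruction.split("search for ", 1)[-1]
    plan = [
        Action("open", "browser"),
        Action("type", query),
        Action("click", "search"),
    ]
    # Return the next unexecuted step, or None when the task is done.
    return plan[len(history)] if len(history) < len(plan) else None


Planner = Callable[[str, List[Action]], Optional[Action]]


def run_agent(instruction: str, planner: Planner, max_steps: int = 10) -> List[Action]:
    """Ask the planner for one action at a time until it reports completion."""
    history: List[Action] = []
    for _ in range(max_steps):
        action = planner(instruction, history)
        if action is None:
            break
        # A real agent would execute the action on screen here.
        history.append(action)
    return history


steps = run_agent("search for cold medicine", stub_planner)
```

In this toy case the agent ends up with the same three steps as the hand-written macro, but the user only supplied one sentence.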

At this point AI can open the browser for you, enter the Meituan address, and search for cold medicine. But a new problem arises: the AI needs your Meituan account to locate the nearest pharmacy.

We need to grant AI more permissions at the underlying layer.

Caption: Ideal workflow of Agent
Source: @zuoyeweb3

This is why Anthropic releasing MCP (Model Context Protocol) and OpenAI launching Operator were necessary moves. Internal optimization of LLMs has reached a local optimum, and now AI/LLMs need to get moving: LLMs need to call each other, LLMs need to integrate with external APIs, and LLMs and humans need to collaborate further.

Let's talk briefly about MCP first; a later article will explain it in detail.

The value of MCP lies in its aim to build a universal API/SDK framework for the LLM era by standardizing the communication format between AI models and other applications. For example, if Claude, OpenAI, and DeepSeek all use the same format to call code completion or to create a Meituan medicine-purchase rule, then whichever model the user chooses, Meituan only needs to configure one interface.
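A rough sketch of what "one interface for every model" could look like. The message shape is modelled on MCP's JSON-RPC 2.0 `tools/call` request as commonly documented; treat the exact fields as an assumption and check the specification before relying on them. The tool name `search_medicine`, its `query` argument, and the merchant-side handler are hypothetical, invented here for illustration.

```python
import json
from itertools import count

_ids = count(1)  # simple request-id generator


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Build a tool-call request; every model client would emit this same shape."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


def handle_tool_call(raw: str) -> dict:
    """Hypothetical merchant-side handler: a single entry point for all models."""
    req = json.loads(raw)
    if req.get("method") != "tools/call":
        return {"jsonrpc": "2.0", "id": req.get("id"),
                "error": {"code": -32601, "message": "Method not found"}}
    params = req["params"]
    if params["name"] != "search_medicine":  # hypothetical tool name
        return {"jsonrpc": "2.0", "id": req["id"],
                "error": {"code": -32602, "message": "Unknown tool"}}
    query = params["arguments"]["query"]
    # A real handler would query the merchant's catalogue here.
    return {"jsonrpc": "2.0", "id": req["id"],
            "result": {"content": [{"type": "text", "text": f"results for {query}"}]}}


# Three different models, one request format, one handler.
responses = {
    model: handle_tool_call(build_tool_call("search_medicine", {"query": "cold medicine"}))
    for model in ("Claude", "OpenAI", "DeepSeek")
}
```

The handler never needs to know which model sent the request, which is the whole appeal of a shared format.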

This does not mean that OpenAI, DeepSeek, or Meituan must follow Anthropic's specific rules, but they can refer to them in their own designs. As with ONNX (Open Neural Network Exchange), a proliferation of models naturally calls for corresponding collaboration standards.

But whichever model is used, the user still has to hand over their Meituan account and password, grant Alipay authorization, and let the AI take over system calls to complete locating, ordering, and summoning a courier. In the end you still have to go downstairs to the parcel locker to pick up the medicine. For now AI cannot run errands for you, and embodied intelligent robots still need time.

The significance of DeepSeek is that it made the LLM smarter at extremely low cost, and its Chinese reasoning ability far exceeds that of its peers. That is its great significance in both technology and product terms, not to mention that the open-source model brings AI down to earth.

This is where Manus's trick lies: it is neither OpenAI's Operator nor a follower of Anthropic's MCP rules. It is essentially reinventing the wheel.

Of course, China also needs to achieve something in model standards so as not to repeat the old path of operating systems and chips. But this has little to do with so-called AGI, because so far I have not seen what Manus's base model is. If it turns out to be a self-developed, smarter base model, that would indeed be good news.

DeFAI and AI Agents Are Still a Work in Progress

The opponent of the cross-chain bridge is not chain abstraction, but the CEX; the enemy of the AI Agent is not other agents, but the wallet.

With Manus hyped by the media, by beta invite codes, and by namesake tokens, and with true and false rumors swirling, Web3 AI Agents are also eager to have a go. Virtuals announced an integration with Enso Shortcuts that lets users interact with one click and currently supports 200 protocols.

The bright side is that Web3 AI Agents are starting to move beyond the dispute over models and steadily toward real user needs. But the old Web2 problem obviously remains: which protocol standard should be supported?

Take the cross-chain bridge as an example. After years of effort LayerZero has basically become the de facto industry standard protocol, yet it still cannot cover every scenario. The reason is that CEXs, especially Binance, are the most convenient cross-chain bridge for assets, and inter-chain messaging is not the current pain point.

The most important direction for Web3 AI Agents to explore is establishing the connection between users, the Agent itself, and Uniswap / Hyperliquid. That is, the AI Agent needs to become the de facto intermediary, private key holder, or custodian. Otherwise its user experience cannot match the existing wallet + DEX experience built on today's infrastructure, let alone compete with CEXs for market share.

Saying this does not negate the prospects of DeFAI; it points out the real obstacle, which is not the level of intelligence but the problem of how to win user trust. Manus needs to compete with MCP and Operator for the right to define standards, and DeFAI projects need the same awareness.

All AI Agent projects need to commit to the long term, iterating and experimenting constantly while they wait for their first users. In fact, DeFAI's opponent is the wallet as a product form, not other agents.

Just as the industry has two paradigms, custodial and non-custodial wallets, the biggest problems facing AI Agents today are strategy and fund security. Fund security was discussed above. As for strategy, even users who dare to authorize an Agent still have to decide what strategy it should follow. In a word: can AI be trusted to manage the user's finances?
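One way to picture the trust problem is as a policy the user sets before handing over any authority. The sketch below is purely illustrative and not taken from any real DeFAI project: `Policy`, `AgentWallet`, and the limit fields are hypothetical names, and real systems would enforce such rules on-chain or in the signer rather than in application code.

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class Policy:
    """User-defined limits on what the agent may do (hypothetical fields)."""
    allowed_protocols: Set[str]
    max_per_trade: float
    daily_limit: float


@dataclass
class AgentWallet:
    policy: Policy
    spent_today: float = 0.0

    def authorize(self, protocol: str, amount: float) -> bool:
        """Approve a spend only if it stays inside every limit the user set."""
        if protocol not in self.policy.allowed_protocols:
            return False
        if amount <= 0 or amount > self.policy.max_per_trade:
            return False
        if self.spent_today + amount > self.policy.daily_limit:
            return False
        self.spent_today += amount  # record the approved spend
        return True


wallet = AgentWallet(Policy({"Uniswap", "Hyperliquid"}, max_per_trade=100.0, daily_limit=150.0))
first = wallet.authorize("Uniswap", 100.0)       # within both limits
second = wallet.authorize("Hyperliquid", 100.0)  # would exceed the daily limit
third = wallet.authorize("Uniswap", 50.0)        # exactly reaches the daily limit
fourth = wallet.authorize("UnknownDex", 1.0)     # protocol not on the allowlist
```

Even with such guardrails the user still has to choose the limits and the allowlist, which is exactly the strategy-setting burden described above.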

The dispute over models and frameworks for Web3 AI Agents has not yet been settled, and further strategy optimization has not yet been put to real practical use. The Robotaxi that Musk once envisioned is still on its way; when will the AI financial master enter every crypto wallet?

Conclusion

It must be emphasized that this article is not a negation of Manus. After all, Workflow + Claude + Cursor is already good enough, and a bit more on top is fine too. If you don't take a bite of the AI bubble, someone else will.

Nor is this article a negation of Web3 AI Agents. After all, staying up late to watch the market, managing private keys, and using Safe without mistakes is safe enough, and letting DeFAI do the PvP for you can spare you the youth lost to late nights.

Just one thing: don't fake it. Fake it and your nose will grow longer.

Disclaimer: The content above is only the author's opinion which does not represent any position of Followin, and is not intended as, and shall not be understood or construed as, investment advice from Followin.