AI Agent’s “GPT moment”, the user’s “universal hand” Manus is born!

PANews

03-06

The "GPT Moment" of the AI Agent: Manus Awakens the Entire AI Circle!

Author: shiyun Zhang Yongyi

Editor: Jingyu

2025 is the first year of the AI Agent, a claim that was borne out in the early morning of March 6, Beijing time.

"After DeepSeek, another sleepless night in the tech circle."

That is how many users put it on social media.

Everyone stayed up all night just for an invitation code to try the product: "Manus", developed by Monica.im and billed as the world's first AI Agent product.

According to the team, "Manus" is a truly autonomous AI agent that can solve a variety of complex and changing tasks. Unlike traditional AI assistants, Manus can not only offer suggestions or answers but also directly deliver complete task results.

The introduction video of Manus is only 4 minutes long, but it is incredibly powerful | Image source: Monica.im

The name "Manus" means "hand" in Latin. The idea is that knowledge should not only sit in the brain but also be carried out by hand. This is the essential step in the evolution from AI Bot (chatbot) products to Agent products.

Where does Manus stand out? The most intuitive way to tell is to look at the showcases on the official website and the use cases users have generated, some of which Geek Park summarizes below:

Travel planning: Not only integrating travel information, but also creating customized travel manuals for users. For example, planning a trip to Japan in April and providing personalized travel recommendations and detailed manuals.
Stock analysis: Conducting in-depth stock analysis, designing visually appealing dashboards to display comprehensive stock insights. For example, conducting in-depth analysis of Tesla stock and creating a visualization dashboard.
Educational content creation: Creating video demonstration materials for middle school teachers to explain complex concepts like the momentum theorem, helping teachers teach more effectively.
Insurance policy comparison: Creating clear insurance policy comparison tables, providing the best decision recommendations, and helping users choose the most suitable insurance products.
Supplier procurement: Conducting in-depth research across the entire network to find the most suitable suppliers for user needs, serving as a truly fair agent for users.
Financial report analysis: Capturing market sentiment changes for specific companies (such as Amazon) through research and data analysis, providing market sentiment analysis for the past four quarters.
Startup company list organization: Visiting relevant websites to identify qualified companies and organizing them into a table. For example, compiling a list of all B2B companies in the YC W25 batch.
Online store operations analysis: Analyzing Amazon store sales data, providing actionable insights, detailed visualizations, and customized strategies to help improve sales performance.

When the Agent outputs an extremely complete and professional result through a long chain of thinking and tool invocation, users begin to exclaim "It can really help humans do things".

According to the official website information, in the GAIA benchmark test (evaluating the ability of general AI assistants to solve real-world problems), Manus achieved new state-of-the-art (SOTA) performance at all three difficulty levels.

In short, what Manus wants most is to be your "agent" in the digital world, and it has achieved this.

As you might expect, Manus's early-morning launch immediately woke up everyone in the AI circle!

01. Manus, Your "Digital Agent"

First, the biggest difference in experience between Manus and earlier LLM products:

It emphasizes the ability to directly deliver the final result, rather than just providing a simple "answer".

Manus currently uses a multi-agent ("Multiple Agent") architecture. Its running mode is similar to Computer Use, released by Anthropic: it runs entirely inside an independent virtual machine. Within that virtual environment it can call various tools (writing and executing code, browsing web pages, operating applications, and so on) and directly deliver complete results.

In the official release video, three cases of Manus' work in actual use scenarios were introduced:

The first task is to screen resumes.

From 15 resumes, it recommends suitable candidates for the reinforcement learning algorithm engineer position and ranks the candidates based on their reinforcement learning expertise.

In this demonstration, you don't even need to unzip the compressed file and upload each resume by hand. Manus already behaves like a human "intern": it unzips the file itself, browses each resume page by page, and records the important information as it goes.

Manus, like an intern, automatically understands the hidden instruction "unzip the packed file the boss threw over" | Image source: Geek Park

The results Manus provides include not only automatically generated ranking recommendations but also a grouping of the candidates into tiers based on important dimensions such as work experience. When the user says they would prefer the results in an Excel spreadsheet, Manus generates the spreadsheet by writing a Python script on the spot.

Manus can even remember "the user prefers to receive the results in a spreadsheet format" during this process, and will prioritize using the spreadsheet form to present similar task results next time.

Manus can remember the user's preferences in the content generation process | Image source: Geek Park

The second case, more tailored for Chinese users, is the selection of real estate.

In this case, the user wants to buy a property in New York and requires a safe neighborhood, a low crime rate, and high-quality primary and secondary schools, all within the most important constraint: a budget the user can afford each month.

Given this request, Manus breaks the complex task down into a to-do list that includes researching safe communities, identifying quality schools, calculating the budget, and searching for properties. It searches the web, carefully reads articles about the safest communities in New York, and collects the relevant information.

Next, Manus writes a Python program to calculate the affordable property budget based on the user's income. Combining the relevant property price information from real estate websites, it filters the property list within the budget range.
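The budget step described above can be sketched roughly as follows. This is a minimal illustration, not the script Manus actually wrote: the 28% housing-to-income ratio, the 6.5% interest rate, the 30-year term, and the listing fields are all assumptions chosen for the example, and the calculation ignores taxes, insurance, and fees.

```python
def max_affordable_price(annual_income, down_payment, annual_rate=0.065,
                         years=30, housing_ratio=0.28):
    """Estimate the highest purchase price a monthly budget can support.

    All default values are illustrative assumptions, not figures from the demo.
    """
    # Monthly housing budget: a fixed share of gross monthly income.
    monthly_budget = annual_income / 12 * housing_ratio
    r = annual_rate / 12   # monthly interest rate
    n = years * 12         # number of monthly payments
    if r == 0:
        loan = monthly_budget * n
    else:
        # Present value of an annuity: the loan whose fixed payment equals the budget.
        loan = monthly_budget * (1 - (1 + r) ** -n) / r
    return loan + down_payment


def filter_listings(listings, budget):
    """Keep only listings priced at or below the budget, cheapest first."""
    affordable = [item for item in listings if item["price"] <= budget]
    return sorted(affordable, key=lambda item: item["price"])
```

Calling `filter_listings(listings, max_affordable_price(income, down_payment))` would then give the shortlist that the later report is built from.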

Manus can automatically search and filter out properties that do not meet the user's requirements | Image source: Geek Park

Finally, Manus integrates all the collected information and writes a detailed report covering community safety analysis, school quality assessment, budget analysis, a recommended property list, and relevant resource links, just like a professional real estate agent. And because Manus has the inherent attribute of "completely considering the user's interests", the experience of using it is arguably even better.

In the last case, Manus demonstrates its stock price analysis capabilities.

The task given is to analyze the correlation between the stock prices of Nvidia, Marvell Technology, and TSMC over the past three years: it is well known that these three stocks are closely related, but it is difficult for new users to quickly sort out the causal relationship.

Manus' operation is very similar to that of a real stock broker. It first accesses information websites like Yahoo Finance to obtain historical stock data, and also cross-checks the accuracy of the data to avoid being misled by a single information source, which could have a significant impact on the final result.

In this case, Manus also wrote Python code, performed data analysis and visualization, and brought in professional financial tools for the analysis. Finally, it reported on the causal relationship through data visualization charts and a detailed, comprehensive analysis report, just like the daily work of an "intern" in the financial field.

Furthermore, the Manus website showcases more than ten scenarios where Manus can be used: directly using Manus to help you organize your itinerary, personalize your travel route recommendations, and even let it learn to use various complex tools to streamline your daily work.

In this process, what truly sets Manus apart from previous tools is its ability to plan autonomously, which is what ensures that tasks actually get executed.

The ability to self-learn also makes Manus' work capabilities more akin to real human intelligence - even if at this stage it may not be able to reach expert-level proficiency in a specific field, its vast potential is already evident.

With the addition of self-learning capabilities, the versatility of the AI Agent has been greatly enhanced. In user tests of Manus, you can even simply describe what happens in a video scene, and Manus will work around the limitations of the platform's built-in search engine to find the exact link to a specific Douyin short video.

Since the current version of Manus is fully cloud-based and asynchronous, its capabilities are not limited by the form factor or computing power of the end-user platform - users can even turn off their computers after issuing instructions to Manus, and Manus will automatically notify them of the results upon completion.
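The "submit, disconnect, get notified" pattern described above can be illustrated with a small `asyncio` sketch. It is only an analogy for the interaction model: `run_task`, the `notify` callback, and the short sleep standing in for real work are all hypothetical, and nothing here reflects how Manus is actually implemented.

```python
import asyncio


async def run_task(instruction, notify):
    """Stand-in for a long-running cloud job; the sleep replaces real work."""
    await asyncio.sleep(0.01)
    result = f"done: {instruction}"
    notify(result)  # push the result to the user once the job finishes
    return result


async def main():
    inbox = []  # hypothetical notification channel
    # Submitting returns immediately; the caller does not wait on the job.
    job = asyncio.create_task(run_task("organize the files", inbox.append))
    still_running_after_submit = not job.done()
    # In a real service the user would simply be offline during this wait.
    await job
    return still_running_after_submit, inbox
```

Running `asyncio.run(main())` shows that the job is still pending right after submission and that the result arrives later through the notification channel.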

This operational logic is also very familiar: it is just like messaging an intern on WeChat after work to say "organize the files and send them to me". The only difference is that this intern really is available 24/7 and won't push back against the workplace.

02. Multi-agent + self-verification: getting the AI Agent workflow to run end to end

From the above cases, it is not difficult to see that the real trump card of Manus is not the "AI Agent" concept that has already appeared in Computer Use, but its ability to "simulate human work methods".

Compared to "running calculations", Manus' work logic is more akin to "thinking and executing commands". It has not achieved what humans are currently truly unable to do; this is why some users who have already experienced the current version of Manus describe it as "an intern".

The Manus website lists numerous tasks that Manus can complete, including a case that demonstrates how to use Manus in B2B business: quickly and accurately matching your ordering needs with global suppliers.

In similar product requirements, the industry-standard logic is to integrate global supply chain enterprise information on the platform to help users match suppliers/demand. But in the Manus case, you can see a completely different implementation approach.

Manus AI uses an architecture called "Multiple Agent", running in independent virtual machines. Through the division of labor and collaboration mechanism of planning agents, execution agents, and verification agents, it significantly improves the processing efficiency of complex tasks and shortens response times through parallel computing.

In this architecture, each agent may be based on an independent language model or reinforcement learning model, communicating with each other through APIs or message queues. Each task also runs in a sandbox to avoid interfering with other tasks, while supporting cloud-based scaling. Each independent model can mimic the process of how humans handle tasks, such as thinking and planning, understanding complex instructions and breaking them down into executable steps, and then calling the appropriate tools.
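The division of labor among planning, execution, and verification agents can be sketched as a small pipeline connected by message queues. Everything in this example is an assumption made for illustration: the `Message` fields, the hard-coded three-step plan, the lookup of tools by step name, and the "non-empty output" verification rule. In the architecture described above, each role would be backed by a model and run in its own sandbox, possibly in parallel; here the stages run sequentially in one process.

```python
import queue
from dataclasses import dataclass


@dataclass
class Message:
    step: str
    output: str = ""
    verified: bool = False


def planner(task):
    """Break a task into ordered steps (hard-coded here; a model would decide)."""
    return [Message(step=f"{task}: {name}") for name in ("research", "compute", "report")]


def executor(msg, tools):
    """Pick a tool by the step name after the last ': ' and run it."""
    name = msg.step.rsplit(": ", 1)[1]
    msg.output = tools[name](msg.step)
    return msg


def verifier(msg):
    """Toy check: a step passes only if it produced non-empty output."""
    msg.verified = bool(msg.output.strip())
    return msg


def run(task, tools):
    """Pass messages planner -> executor -> verifier through two queues."""
    to_execute, to_verify = queue.Queue(), queue.Queue()
    for msg in planner(task):
        to_execute.put(msg)
    while not to_execute.empty():
        to_verify.put(executor(to_execute.get(), tools))
    results = []
    while not to_verify.empty():
        results.append(verifier(to_verify.get()))
    return results
```

Keeping the roles behind queues means each stage only needs to understand the message format, which is what would allow the stages to be moved onto separate workers or machines later.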

In other words, Manus' multi-agent architecture is more like having multiple assistants who cooperate on resource retrieval, interfacing, and verifying information to help you complete the entire workflow. It is less like hiring an "intern" and more like becoming a miniature "department manager" yourself.

In the B2B business case, Manus, through web crawling, code writing, and execution capabilities, automatically searches the vast ocean of the internet to match the most suitable suppliers for you based on your needs, in terms of product quality, price, delivery capabilities, and more. Not only can it present the conclusions to you in a visual way, but it can also provide more detailed operational recommendations based on the data.

Manus may be more useful than the built-in tools of a single platform in fulfilling B2B scenarios | Image source: Geek Park

As for how the Monica team achieved the results shown in the video and what technology they used, the team may reportedly reveal this on March 6th, Beijing time.

03. "Stitching" taken to the extreme makes a hit product

What kind of company is Monica.im, the company behind Manus?

Monica is an All-in-One AI assistant whose product form started as a browser extension and gradually expanded to apps and web pages. In the mainstream usage scenario, users click its small icon in the browser and can directly use the various mainstream models it has integrated. By accurately understanding the needs of niche user scenarios, Monica has picked the "low-hanging fruit" of large models.

Its founder, Xiao Hong (English name Red), is a young serial entrepreneur who was born in 1992 and graduated from Huazhong University of Science and Technology. After graduating in 2015 he began starting businesses, and the early attempts (such as campus social networking and a second-hand market) were not very successful. In 2016, he started a business providing editing and data analysis tools for WeChat public account operators, which gained millions of users and became profitable; the product was eventually sold to a unicorn company in 2020.

It was not until the large-model wave in 2022 that he officially founded Monica, focusing on the overseas market, and quickly completed the cold start through the independent developer product ChatGPT for Google.

In 2024, at the same time as the launch of GPT-4o, Claude 3.5, and the OpenAI o1 series, Monica allowed users to access the latest SOTA models. With the integration of new models, Monica's professional search, DIY Bot, Artifacts mini-program development, and memory functions have also been well-received by users. Monica presents different interactive forms and functions on web pages such as YouTube, Twitter, Gmail, and The Information, to adapt to the specific needs of users in different scenarios, updating the personalized AI experience of hundreds of web pages.

In 2024, Monica's user base doubled to 10 million. At the same time, it maintains considerable profitability and ranks among the top in overseas similar products.

Monica's strong performance verifies one thing:

Wrapping models to the extreme achieves both TPF (technology-product fit) and PMF (product-market fit), and ultimately delivers user value.

Monica homepage | Image source: Monica

Manus may be a continuation of the Monica team's approach. In an interview with journalist Zhang Xiaojun, Xiao Hong said that products cannot be limited to chatbots, and that the Agent will be a new form that needs new products to carry it.

He was inspired by the AI programming products Cursor and Devin. According to Geek Park, the former is mainly in copilot mode, while the latter is more in autopilot mode, which is more in line with human needs. The Agent should also be like Devin, aimed at the general public, and truly led by AI for execution. But the problem in the past was that the models were not smart enough.

However, given the current capabilities of the models, packaging them into scenario-based services may be where the Monica team has an advantage. Xiao Hong said that there are not many teams building Agent products at the moment, because doing so requires many combined capabilities: experience in chatbots, in AI programming, and in browser-related work (because everything runs in the browser), plus a good sense of the boundaries of the models, meaning what level they have reached today and how they will develop in the future.

"There are not many companies that have these capabilities, and the companies that have these capabilities may be busy with a very specific business, but we happen to have colleagues who have time to do this together," he said.

As for why Monica was the one to do it, he summarized: "First, I think we are quite lucky. Second, to some extent, if everyone goes to do reasoning now, maybe there will be more time for startups? How far can the external spillover of model predictive capabilities go?"

He believes that Agent is still in the early stage at the moment. One is that Agent is still in the planning stage and has not yet reached physical world execution; the other is that the capabilities of large models are still developing, and everything is unpredictable.

"I certainly don't know how Agent can be brought out in this way, it is an unknown thing," he said.

Interestingly, Monica, the company that "did not know how to do Agent", has now produced a product that has shocked the entire AI circle.

Manus may not be the ultimate AI Agent, but it has undoubtedly raised people's expectations for AI to a new level after the DeepSeek explosion.

*Header image source: Monica.im
