Multi-agent systems: current situation and prospects

Author: Jinming Source: HashKey Capital Translation: Shan Ouba, Jinse Finance

Introduction

The concept of the artificial intelligence agent (AI agent), an intelligent software system that can perceive its environment and autonomously act on behalf of users or machines to achieve their goals, was proposed as early as the 1980s. However, the concept only began to gain wide attention in the 2010s with the rise of deep learning and large language models (LLMs), which demonstrated an ability to understand and generate human-like responses.

Today, LLMs have become an integral part of our lives: products like ChatGPT have over 15.5 million paying users worldwide, and demand is set to grow further as OpenAI rolls out smarter reasoning models. The widespread adoption of LLMs such as ChatGPT, Claude, and DeepSeek has paved the way for a natural evolution toward an agent economy. An agent is more complex than an LLM: it is a system consisting of one or more models plus a framework and toolset that define the agent’s identity (Figure 1).

Equipped with roles and toolkits, agents can receive tasks, analyze and process them, and autonomously act on behalf of users, although human involvement is sometimes still required to provide feedback from which agents learn through reinforcement learning. Agents are inherently composable. As they become more specialized and technically mature, human involvement may take a backseat, and communication between agents will become the focus for simplifying complex workflows and unlocking efficiency gains. As agent frameworks continue to advance, we expect substantial benefits across a variety of applications from integrating them with blockchain, a technology built on transparency, decentralization, and incentive alignment.

Furthermore, by leveraging the trust, security, and transparency of blockchain technology, agents operating through smart contracts can perform autonomous wallet transactions, earn token incentives for good behavior, and be penalized for adversarial behavior. In this report, we first explore what multi-agent systems are and the orchestration frameworks that support their development, and then examine the synergy between multi-agent systems and Web3 technologies. Finally, we explore the use cases and challenges of Web3 multi-agent frameworks and the efforts under way to address those challenges.

Figure 1: Components of an agent

Multi-agent system

In a multi-agent system, unlike a single-agent system, agents can each focus on their own area and collaborate, simulating human teamwork to solve multi-step, complex real-world problems (Figure 2). This extends the cognitive and reasoning capabilities available to an agent built on a single LLM and provides greater scalability and efficiency. A single LLM-based agent bears the burden of completing the whole task from start to finish, which often leads to delays and bottlenecks as tasks become more complex and demanding.

In a multi-agent system, there is usually a task manager that defines the task requirements, breaks the task down into subtasks, and delegates those subtasks to agents based on their capabilities, making multi-agent systems more resilient and better suited to large-scale enterprise use cases. The collaborative design also facilitates efficient memory management, because each agent stores only the context relevant to its role. Thanks to this distributed architecture, no single agent has to handle a large memory load, which improves scalability and opens the door to a wider range of use cases.
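
The delegation pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not the design of any particular framework: the names `Agent`, `TaskManager`, and `delegate` are hypothetical, and the task decomposition is hard-coded where a real system would typically ask an LLM to produce it.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A worker that only remembers context relevant to its own role."""
    name: str
    skills: set[str]
    memory: list[str] = field(default_factory=list)

    def run(self, subtask: str) -> str:
        self.memory.append(subtask)  # each agent stores only its own subtasks
        return f"{self.name} finished: {subtask}"


class TaskManager:
    """Routes (skill, subtask) pairs to the first agent that has the skill."""

    def __init__(self, agents: list[Agent]) -> None:
        self.agents = agents

    def delegate(self, subtasks: list[tuple[str, str]]) -> list[str]:
        results = []
        for skill, subtask in subtasks:
            agent = next((a for a in self.agents if skill in a.skills), None)
            if agent is None:
                raise LookupError(f"no agent can handle skill {skill!r}")
            results.append(agent.run(subtask))
        return results


researcher = Agent("researcher", {"search"})
writer = Agent("writer", {"summarize"})
manager = TaskManager([researcher, writer])

# Hypothetical decomposition of one task into two subtasks (hard-coded here).
results = manager.delegate([
    ("search", "collect recent reports on agent frameworks"),
    ("summarize", "write a one-paragraph summary"),
])
```

Because each agent appends only the subtasks it actually handles, the researcher never holds the writer's context, which is the memory-isolation property the paragraph describes.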

The key to developing multi-agent systems lies in multi-agent frameworks that enable agents to communicate and coordinate effectively toward a given goal. Together with multi-agent reinforcement learning (MARL), simulated environments, and improved agent orchestration layers, these frameworks open up exciting opportunities for agent-driven applications across industries, including crypto. Below, we examine some multi-agent orchestration frameworks in Web2 and Web3 that unlock new possibilities through agent-driven workflows.

Multi-agent orchestration framework

A multi-agent orchestration framework manages LLM-based agents as they work together to solve problems. Compared with a single agent, a multi-agent system can simplify the automation of complex tasks and make it more efficient.

Figure 2: Multi-agent framework architecture

Please note that this is not an exhaustive list as multi-agent frameworks are constantly evolving.

AutoGen

AutoGen is an open source multi-agent framework designed by Microsoft Research's AI Frontiers Lab. Its modular and extensible design facilitates the development of multi-agent applications. AutoGen Core implements message passing and event-driven agents that can be programmed in Python and .NET. The AgentChat API, built on top of the Core API, enables seamless communication between agents. Extensions let agents perform functions such as web browsing, video analysis, and file analysis, and wrap LangChain tools. Magentic-One, built on AutoGen, is capable of tasks such as executing code, browsing the web, and managing files.

CrewAI

CrewAI is an open source multi-agent platform that automates tasks through well-defined, role-based multi-agent orchestration. Its architecture allows agents with configurable roles, goals, and personalities to work sequentially or in parallel, ensuring orderly task execution. To ground their output, agents can draw on a knowledge base that supports text sources and structured data formats. CrewAI also provides access to LangChain and LlamaIndex tools, as well as enterprise-level capabilities provided by Portkey, enabling agents to use external APIs, databases, and retrieval systems. The platform is developer-friendly and supports YAML-based configuration, which makes agents easy to configure and deploy.

Langroid

Langroid is an open source Python framework that takes multi-agent programming as its core design principle, treating agents as first-class citizens. Recognized by developers for its simplicity, intuitiveness, and extensibility, the framework provides a variety of modules and tools for building complex agent applications. By default, agents act as message transformers and have three responder methods: an LLM responder, an agent responder, and a user responder. Together, these let agents execute functions, generate human-readable natural language responses, and incorporate human feedback into their workflows. Wrapping agents in tasks lets them orchestrate interactions by delegating subtasks to other agents. Langroid supports OpenAI LLMs and LLM function calling through its ToolMessage mechanism, giving agents access to a variety of tools and functions. Combined with integrations with vector stores such as LanceDB, Qdrant, and Chroma, this gives Langroid agents persistent conversation state and vector-store memory, making them good at managing complex, dynamic scenarios.
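
To make the three-responder idea concrete, here is a plain-Python sketch. It does not use or reproduce Langroid's actual API: the class `SketchAgent`, the `step` helper, the `TOOL`/`RESULT` message prefixes, and the fixed responder order are all assumptions made for illustration, and the "LLM" is a hard-coded stand-in.

```python
from typing import Callable, Optional


class SketchAgent:
    """Toy message transformer with three responders (illustrative names)."""

    def __init__(self, ask_human: Callable[[str], str]) -> None:
        self.ask_human = ask_human

    def llm_response(self, msg: str) -> Optional[str]:
        # Stand-in for an LLM call: request a tool for arithmetic,
        # or turn a tool result into a natural-language reply.
        if msg.startswith("add "):
            return "TOOL add " + msg[4:]
        if msg.startswith("RESULT "):
            return f"The answer is {msg[7:]}."
        return None

    def agent_response(self, msg: str) -> Optional[str]:
        # Executes tool requests emitted by the LLM responder.
        if msg.startswith("TOOL add "):
            a, b = msg[9:].split()
            return f"RESULT {int(a) + int(b)}"
        return None

    def user_response(self, msg: str) -> Optional[str]:
        # Falls back to a human when no other responder applies.
        return self.ask_human(msg)


def step(agent: SketchAgent, msg: str) -> Optional[str]:
    """Return the first non-empty response, trying responders in a fixed order."""
    for responder in (agent.agent_response, agent.llm_response, agent.user_response):
        reply = responder(msg)
        if reply is not None:
            return reply
    return None


agent = SketchAgent(ask_human=lambda m: "human: please clarify")

msg = "add 2 3"
trace = []
for _ in range(3):
    msg = step(agent, msg)
    trace.append(msg)

fallback = step(agent, "hello")
```

In this run the message passes from the LLM responder (which requests a tool) to the agent responder (which executes it) and back to the LLM responder (which phrases the answer), while an unrecognized message falls through to the human.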

CAMEL

CAMEL is an open-source multi-agent framework that provides a common infrastructure for applications such as task automation, data generation, and real-world simulation. Within CAMEL, the societies module plays a vital role in multi-agent coordination. It contains two frameworks, RolePlaying and BabyAGI, designed to manage agent interactions and drive goal-oriented outcomes. Its role-playing, conversation-oriented approach makes it well suited for building customer-facing agents. CAMEL's integration with various vector databases and LLMs supports retrieval-augmented generation (RAG) and provides persistent memory for its agents, making it well suited for large-scale enterprise applications. However, success with the RolePlaying framework currently depends on effective prompt engineering and role design, which may make it less friendly to developers without a strong coding and AI background. CAMEL has deployed an AI chatbot, Eigent Bot, which can obtain real-time information, supports multimodal capabilities, and uses graph-based RAG for better contextual understanding.

MetaGPT

MetaGPT is a meta-programming multi-agent orchestration framework that encodes standard operating procedures (SOPs) as prompt sequences and combines them with clearly defined agent roles and responsibilities. This design helps mitigate the risk of hallucinations compounding through inter-agent interactions. Rather than holding one-to-one conversations, agents in MetaGPT publish output in a defined format to a shared message pool, which reduces irrelevant or lost content. The framework also implements an executable feedback mechanism to support self-correction and review. MetaGPT is particularly effective in software development, where clearly defined roles can improve code quality and task allocation. On code generation benchmarks, MetaGPT scores 85.9% on HumanEval and 87.7% on MBPP.
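
The shared message pool can be pictured as a small publish/subscribe store. The following is a minimal sketch under stated assumptions, not MetaGPT's implementation: `Message`, `MessagePool`, the `kind` field, and the role names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: str
    kind: str      # structured output type, e.g. "requirements" or "design"
    content: str


class MessagePool:
    """Agents publish typed messages and read only the kinds they subscribed to."""

    def __init__(self) -> None:
        self.messages: list[Message] = []
        self.subscriptions: dict[str, set[str]] = defaultdict(set)

    def subscribe(self, agent: str, kind: str) -> None:
        self.subscriptions[agent].add(kind)

    def publish(self, message: Message) -> None:
        self.messages.append(message)

    def inbox(self, agent: str) -> list[Message]:
        wanted = self.subscriptions[agent]
        return [m for m in self.messages if m.kind in wanted]


pool = MessagePool()
pool.subscribe("architect", "requirements")
pool.subscribe("engineer", "design")

pool.publish(Message("product_manager", "requirements", "users can log in"))
pool.publish(Message("architect", "design", "auth service + user table"))

architect_inbox = pool.inbox("architect")
engineer_inbox = pool.inbox("engineer")
```

Filtering by message kind is what keeps irrelevant content out of each agent's context: the engineer sees the design but not the raw requirements unless it subscribes to them.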

LangGraph

LangGraph is an open source agent framework developed by the creators of LangChain. Designed to manage complex multi-agent workflows, it has a modular architecture that lets different agents communicate, coordinate, and perform tasks efficiently. By modeling the relationships between the components of an agent workflow as a graph, LangGraph facilitates dynamic task allocation, scalability, and problem solving across distributed systems. This approach simplifies state management for multi-step workflows that require persistent context. In addition, LangChain MCP Adapters, a lightweight wrapper for the Model Context Protocol (MCP), convert MCP tools into LangChain tools that LangGraph agents can use, expanding their available toolset. In the multi-agent space, LangGraph benefits from a strong network effect because it builds on the LangChain ecosystem.
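
The graph-based idea can be illustrated without the library itself. The sketch below is plain Python and deliberately does not use LangGraph's API: `WorkflowGraph`, `add_node`, the `END` sentinel, and the draft/review example are assumptions chosen to show nodes that transform a shared state and routers that pick the next node.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]
END = "__end__"


class WorkflowGraph:
    """Nodes transform a shared state; a router per node picks the next node."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Callable[[State], State]] = {}
        self.routers: Dict[str, Callable[[State], str]] = {}

    def add_node(self, name: str, fn: Callable[[State], State],
                 router: Callable[[State], str]) -> None:
        self.nodes[name] = fn
        self.routers[name] = router

    def run(self, start: str, state: State, max_steps: int = 20) -> State:
        current, steps = start, 0
        while current != END:
            if steps >= max_steps:
                raise RuntimeError("workflow did not finish within max_steps")
            state = self.nodes[current](state)      # update the shared state
            current = self.routers[current](state)  # conditional edge
            steps += 1
        return state


def draft(state: State) -> State:
    revision = state["revision"] + 1
    return {**state, "revision": revision, "draft": f"draft v{revision}"}


def review(state: State) -> State:
    # Hypothetical rule: approve from the second revision onward.
    return {**state, "approved": state["revision"] >= 2}


graph = WorkflowGraph()
graph.add_node("draft", draft, lambda s: "review")
graph.add_node("review", review, lambda s: END if s["approved"] else "draft")

final = graph.run("draft", {"revision": 0})
```

The state dictionary is carried through every step, so the loop from review back to draft keeps its context without either node managing it explicitly.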

ElizaOS

ElizaOS, perhaps the most well-known Web3 multi-agent framework, is an open-source TypeScript framework that embeds Web3 components to address entry barriers and accessibility issues in the crypto industry. The framework is modular in design with an extensive set of plugins, and currently supports a range of models (e.g., OpenAI, DeepSeek, Llama, Qwen), platform integrations (e.g., Twitter, Discord, Telegram, Farcaster), and over 25 chains (e.g., Solana, Ethereum, Ton, Aptos, Sui, Sei). Its integration with the GOAT SDK also enables agents to perform various on-chain operations. The core architecture of ElizaOS consists of agents, character files, providers, actions, and evaluators, which together give agents persistent memory and context awareness when performing tasks, with feedback from evaluators to improve performance.

One notable example is the ai16z DAO Fund, which used the ElizaOS framework to create an autonomous agent that could filter market signals and trade various meme coins. At its peak, it managed over $36 million in AUM.

As the most mature agent framework in Web3, ElizaOS continues to gain popularity among Web3 developers, with over 14K GitHub stars and 99 integrations at present. A planned no-code/low-code agent launchpad could further stimulate developer interest.

RIG

Another popular Web3 agent framework, with over 3K GitHub stars, is RIG, an open source Rust-based agent framework that stands out for its lightweight core and support for advanced reasoning patterns (from prompt chains to conditional logic and parallel task execution). RIG provides a unified API across supported LLM providers (OpenAI, Cohere, DeepSeek, etc.) and simplified embedding and vector-store support for RAG implementations. Custom tools can also be created, making the framework extensible for LLM-based applications.

Leveraging Rust's asynchronous capabilities, multi-agent systems built with RIG can process multiple tasks concurrently. RIG currently lags behind ElizaOS in Web3-native integrations, with 23. However, ARC, the developer behind RIG, has partnered with the Solana Foundation to drive adoption of the framework by providing targeted grants to developers who use RIG to build Rust-based agents. In addition, ARC has launched its agent launchpad, Forge, which adopts a launchpad model similar to that of Virtuals but currently admits only whitelisted teams. A notable use case of RIG and Forge is AskJimmy, a multi-agent hedge fund that coordinates a group of agents driven by a library of trading strategies to execute trades across EVM chains and Solana on leading platforms such as Hyperliquid, Drift, and GMX.

GAME

The GAME framework, developed by the Virtuals Protocol team, is an open-source multi-agent framework based on Python and JavaScript that facilitates the creation of on-chain agents. Its integration with the Web3 library GOAT SDK gives agents access to more than 200 on-chain operations across various protocols. Tasks are processed hierarchically: a task planner breaks tasks down into subtasks and delegates them to specialized worker agents, which coordinate and communicate to deliver the final output. Currently, most of its agents revolve around social media platforms and in-game environments, the most famous being AIXBT. Since its launch, AIXBT, an AI-driven on-chain analytics influencer with its own X account, has been widely recognized for its analytical insights, with over 490,000 followers at the time of writing.

Source: Virtuals Protocol GAME Architecture

uAgents

uAgents is a Python-based multi-agent framework developed by Fetch.AI. It integrates with various Web2 frameworks such as LangChain, Vertex AI, and CrewAI, making it easy to create and deploy autonomous agents on the Fetch.AI blockchain. Once created, an agent is registered in the Almanac smart contract, so other agents can query the contract and identify a recipient agent by its agent address and HTTP endpoint. Cryptographic security protects the interactions between agents, allowing the most appropriate agent to fulfill a user request without compromising security.
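
The registry lookup described above can be sketched as an in-memory stand-in. This is not the uAgents API or the actual Almanac contract interface: `AlmanacSketch`, `register`, `resolve`, and the address and endpoint strings are placeholders, and a real registry would live on-chain and verify signatures.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Registration:
    address: str    # agent address (placeholder string in this sketch)
    endpoint: str   # HTTP endpoint where the agent receives messages


class AlmanacSketch:
    """In-memory stand-in for an on-chain registry keyed by agent address."""

    def __init__(self) -> None:
        self._records: Dict[str, Registration] = {}

    def register(self, address: str, endpoint: str) -> None:
        self._records[address] = Registration(address, endpoint)

    def resolve(self, address: str) -> Optional[str]:
        record = self._records.get(address)
        return record.endpoint if record else None


almanac = AlmanacSketch()
almanac.register("agent-weather-address", "http://localhost:8001/submit")

# A sender looks up the recipient's endpoint before sending a message.
endpoint = almanac.resolve("agent-weather-address")
missing = almanac.resolve("agent-unknown-address")
```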

Comparative Analysis (Web2 Framework and Web3 Framework)

[Comparison tables: five images in the original, not reproduced here]

Advantages of Web3 Multi-Agent Framework

While Web2 multi-agent frameworks are relatively mature and have garnered strong institutional demand, they lack the native on-chain functionality of Web3 multi-agent frameworks. Developers using Web2 tools must bolt on third-party libraries to interact with smart contracts or parse blockchain data, which introduces complexity and potential vulnerabilities. Developers using Web3 multi-agent frameworks benefit from built-in on-chain functionality, which makes deploying on-chain agents more seamless and lets them focus on designing a good front-end user experience. Additionally, by using blockchain and smart contracts as the underlying infrastructure, on-chain agents can benefit from cryptographic rails, such as wallets that perform on-chain actions on behalf of users and mechanisms that keep incentives aligned.

Performance Metrics of Web3 Multi-Agent Framework

[Performance metrics table: image in the original, not reproduced here]

Simplifying workflows in Web3

Despite the growing maturity and popularity of Web2 agent frameworks, the agent concept did not gain traction in Web3 until Q4 2024. Major players such as ElizaOS, Virtuals Protocol, and RIG (each with its own token) have achieved significant market capitalizations, pointing to strong demand for AI agents in Web3 that goes beyond speculative trading. The excitement reflected in these market capitalizations is not unfounded: Web3 is still struggling to achieve mainstream adoption, and having agents autonomously perform on-chain actions has huge potential to transform the user experience. Beyond efficiency gains, the case for agents in Web3 rests on the same arguments made for AI on blockchain more broadly, namely transparency and traceability, as well as advanced security features. Agent transactions are recorded on the blockchain, so users can easily track and verify the actions agents take. Below, we highlight some of the key areas best suited to agent adoption.

DeFAI

On-chain transactions are inherently complex and require users to have at least a basic understanding of blockchains and Web3 wallets. This creates a poor user experience and remains a significant barrier for non-crypto-native users. Although social login has recently been widely adopted by Web3 wallet providers, progress on account and chain abstraction has remained slow and limited. Users still need to understand concepts such as gas fees, wallet addresses, and bridges when navigating the DeFi landscape. In contrast, OpenAI’s recently launched Operator agent only requires users to give simple natural-language instructions to execute transactions, with backend agent processing abstracting away the multiple steps users would otherwise have to take. Web3 should be no different, and we believe that integrating AI agents with DeFi protocols (DeFAI) can make user onboarding easier and the experience seamless.

Virtuals Protocol recently launched the Agent Commerce Protocol, which standardizes how agents communicate and interact with each other. It introduces a four-stage process: request, negotiation, transaction, and evaluation. Evaluators, smart contract-based escrow, and cryptographic verification are core features of the framework that ensure the delivered work meets the requirements of the task. Once all requirements are met, smart contract triggers unlock the funds and deliver the service, so that transactions are conducted transparently and trustlessly. The Agent Commerce Protocol is just one example of how a multi-agent orchestration framework can help drive agent interactions on-chain in a trustless and secure manner.
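
The four-stage flow with escrow can be modeled as a small state machine. The sketch below is an off-chain toy under stated assumptions and does not reflect the protocol's actual contracts: `EscrowJob`, its method names, the starting balances, and the rule that funds are refunded to the buyer when evaluation fails are all hypothetical.

```python
from enum import Enum
from typing import Callable, Optional


class Phase(Enum):
    REQUEST = 1
    NEGOTIATION = 2
    TRANSACTION = 3
    EVALUATION = 4
    COMPLETED = 5
    REFUNDED = 6


class EscrowJob:
    """Funds are locked at transaction time and released only after evaluation."""

    def __init__(self, buyer: str, seller: str, task: str) -> None:
        self.buyer, self.seller, self.task = buyer, seller, task
        self.phase = Phase.REQUEST
        self.price = 0
        self.escrow = 0
        self.deliverable: Optional[str] = None
        self.balances = {buyer: 100, seller: 0}  # made-up starting balances

    def _require(self, phase: Phase) -> None:
        if self.phase is not phase:
            raise RuntimeError(f"expected {phase.name}, currently {self.phase.name}")

    def accept_request(self) -> None:
        self._require(Phase.REQUEST)
        self.phase = Phase.NEGOTIATION

    def agree_price(self, price: int) -> None:
        self._require(Phase.NEGOTIATION)
        self.price = price
        self.phase = Phase.TRANSACTION

    def fund_and_deliver(self, deliverable: str) -> None:
        self._require(Phase.TRANSACTION)
        if self.balances[self.buyer] < self.price:
            raise ValueError("buyer cannot fund the escrow")
        self.balances[self.buyer] -= self.price
        self.escrow = self.price  # funds are locked, not yet paid out
        self.deliverable = deliverable
        self.phase = Phase.EVALUATION

    def evaluate(self, evaluator: Callable[[str, str], bool]) -> None:
        self._require(Phase.EVALUATION)
        approved = evaluator(self.task, self.deliverable or "")
        payee = self.seller if approved else self.buyer  # assumed refund rule
        self.balances[payee] += self.escrow
        self.escrow = 0
        self.phase = Phase.COMPLETED if approved else Phase.REFUNDED


job = EscrowJob("buyer_agent", "seller_agent", "summarize report")
job.accept_request()
job.agree_price(30)
job.fund_and_deliver("summary: three key findings")
job.evaluate(lambda task, output: output.startswith("summary"))
```

Enforcing the phase order in `_require` means the seller cannot be paid before an evaluator has checked the deliverable, which is the guarantee escrow is meant to provide.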

Olas Protocol demonstrates a practical application of DeFAI: its Pearl app store contains the Mobius and Optimus agents, which use the Olas stack to automate DeFi strategies on platforms such as Uniswap, Balancer, and Sturdy across networks such as Optimism, Base, and Mode. Olas Protocol's Mech marketplace also acts as an exchange for agent tools and plugins, allowing deployed agents to outsource tasks through inter-agent communication. Another notable example is Questflow, which proposes a multi-agent orchestration framework for intent matching. A user's request is handled by an orchestrator that identifies the relevant agents and assigns tasks to them through a task manager that oversees the execution of agent workflows. Since agents are dispatched through the Deagent agent registry, agent creators can also receive fair rewards.

Data Ownership

Amid the vast agent landscape and the large amount of on-chain data being generated, on-chain analytics is becoming an increasingly valuable area, with many projects seeking to provide data labeling services (e.g. Sahara AI), tracking (Arkham Intelligence, Kaito), and attestation registries (EAS, BAS, etc.). As powerful assistants to users, agents can contribute to the growing data landscape in Web3 with users' permission, allowing users to receive fair rewards for their data contributions.

Gaming

There is a growing interest and demand for AI-powered agents in the Web3 gaming community. Game agents can power non-player characters (NPCs) or manage in-game economies. They help create dynamic, responsive environments by autonomously performing tasks and responding to player actions. Notable projects in this space include Parallel’s WayFinder platform, which is building a knowledge graph that can be used by AI agents in different agent workflows in games. Treasure DAO is another notable example, which recently announced the upcoming launch of the MAGE agent launch platform powered by ElizaOS, taking a further step towards an agent-driven Web3 gaming landscape. Virtuals Protocol also launched Project WestWorld, an interactive simulation in Roblox where multiple agents autonomously interact and drive dynamic game narratives powered by the GAME framework.

Other Use Cases

  • AI-driven DAOs: Agents can distill lengthy proposals into digestible messages that mainstream users can easily understand and vote on, thereby strengthening the core ethos of decentralization.

  • Smart contract auditing, network analysis, fraud detection: Agents can play a vital role in debugging, often identifying potential risks faster than humans, thereby reducing security risks when combined with human intelligence.

  • Supply chain optimization: Agents can streamline operations and make them more cost-effective by combining the predictive power of AI with the transparency and security features of blockchain.

Challenges and Efforts for Mature Web3 Multi-Agent Systems

Multi-agent systems (MAS) in the Web3 environment, where agents run on decentralized infrastructure and are often coordinated using smart contracts, face several limitations and challenges that can affect their design, deployment, and performance. Here are some of the obstacles that Web2 and Web3 agents may face:

  • Model hallucination. Like systems based on a single LLM, multi-agent systems are subject to the risk of model hallucination. The risk can be more severe in multi-agent systems, because a hallucination passed from one agent to another compounds the problem, and poorly managed communication between agents leads to suboptimal performance. Therefore, even as we move towards fully autonomous agents, many frameworks will still require some human supervision.

  • Achieving consensus and state synchronization between agents. To complete a task successfully, an agent must navigate a complex, hierarchical multi-agent system while staying consistent with the overall task, its own responsibilities, and the communication between agents.

  • Scalability, latency, and security. Agents in Web3 face scalability and latency issues because they run on the underlying blockchain and therefore compete for block space with other types of transactions. This may mean that we will not see full on-chain orchestration of large agent networks until blockchain scalability challenges are solved. Security and privacy challenges specific to blockchains add further complexity in the Web3 environment. However, these are gradually being addressed by emerging solutions such as Turnkey, which provides a TEE solution (AWS Nitro Enclaves) in which agents can perform actions securely and verifiably. Phala Network also announced a partnership with GoPlus to enhance ElizaOS agents using Phala’s TEE capabilities and GoPlus security features.

  • Multi-agent memory management. In a multi-agent system, different agents perform different tasks and store different information. Therefore, to ensure the successful delivery of the overall goal, reaching a consensus on information is helpful, while implementing a strong access control mechanism is crucial because some agents may be processing highly sensitive information. Failure to implement strong security measures may lead to data privacy leakage and task execution failure.

  • The lack of comprehensive benchmarks and evaluation standards in certain areas (e.g., scientific laboratory experiments, economics modeling, and on-chain skills) may hinder the rapid growth of the field.
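
The access-control point in the memory management item above can be illustrated with a minimal sketch. It assumes a simple role-based read list per memory entry; `SharedMemory`, `write`, `read`, and the role names are hypothetical, and a production system would add authentication, auditing, and encryption.

```python
from typing import List, Set, Tuple


class SharedMemory:
    """Shared store where each entry lists the roles allowed to read it."""

    def __init__(self) -> None:
        self._entries: List[Tuple[str, Set[str]]] = []

    def write(self, text: str, readers: Set[str]) -> None:
        # Copy the reader set so later changes by the caller have no effect.
        self._entries.append((text, set(readers)))

    def read(self, role: str) -> List[str]:
        # An agent only sees entries that explicitly grant access to its role.
        return [text for text, readers in self._entries if role in readers]


memory = SharedMemory()
memory.write("market summary for today", {"analyst", "trader"})
memory.write("wallet signing policy", {"trader"})

analyst_view = memory.read("analyst")
trader_view = memory.read("trader")
auditor_view = memory.read("auditor")
```

Access is denied by default here: a role that is not listed on any entry, such as the auditor, reads nothing, so sensitive entries stay with the agents that need them.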

Conclusion

The future of multi-agent frameworks is promising but also challenging, and there is a long road ahead. Compared with established and institutionally recognized Web2 multi-agent frameworks, Web3 multi-agent frameworks are still in their relative infancy, with few production-ready use cases. Nonetheless, regulatory shifts and ongoing efforts to mitigate the challenges described above are key catalysts for further adoption.

Additionally, agent development tools that simplify agent creation and expand agent use cases (e.g. SendAI Suite, Coinbase Agent Suite, ShellAgent No-Code Platform, Olas Stack) continue to mature, driving growth and innovation among developers. Advances in Web3 libraries such as the GOAT SDK expand the range of operations agents can perform. Ultimately, as the technology evolves and these systems mature, we expect agent workflows to become commonplace in on-chain interactions. Just as there are many Web2 multi-agent frameworks, we expect to see more agent frameworks in Web3, offering both general-purpose and niche approaches.
