The intersection of AI and DePIN.

PANews
07-31

Written by Geng Kai and Eric, DFG

Introduction

As of 2023, AI and DePIN are both hot trends in Web3, with market caps of $30 billion and $23 billion respectively. Each category is broad, covering a variety of protocols that serve different areas and needs and that deserve separate coverage. This article, however, focuses on the intersection between the two and examines the development of protocols in this field.

The intersection of AI and DePIN

In the AI technology stack, DePIN networks give AI practical utility by supplying computing resources. The growth of large technology companies has caused a GPU shortage, leaving other developers who are building their own AI models without enough GPUs for computation. These developers often turn to centralized cloud providers, but inflexible, long-term contracts for high-performance hardware make this inefficient.

DePIN offers a more flexible and cost-effective alternative, using token rewards to incentivize resource contributions that align with network goals. AI-focused DePINs crowdsource GPU resources from individual owners and data centers, forming a unified supply for users who need access to hardware. These networks not only give developers customizable, on-demand computing power, but also provide additional income to GPU owners whose hardware would otherwise sit idle.

With so many AI DePIN networks on the market, it can be difficult to tell them apart and find the right one for your needs. In the following sections, we explore what each protocol does, what it is trying to achieve, and some specific highlights of what it has accomplished.

AI DePIN Network Overview

Each of the projects covered here has a similar purpose: building a GPU compute marketplace network. This section examines each project's highlights, market focus, and achievements. By first understanding their key infrastructure and products, we can gain insight into the differences between them, which are analyzed in the next section.

Render is a pioneer in P2P networks that provide GPU computing power. It originally focused on rendering graphics for content creation, and later expanded its scope to AI computing tasks ranging from Neural Radiance Fields (NeRF) to generative AI through integrations with toolsets such as Stable Diffusion.


Highlights:

  1. Founded by OTOY, the cloud graphics company with Oscar-winning technology

  2. Its GPU network is used by big names in the entertainment industry, including Paramount Pictures, PUBG, and Star Trek

  3. Partnering with Stability AI and Endeavor to integrate their AI models with 3D content rendering workflows using Render’s GPU

  4. Approved multiple compute clients so that more GPUs can be integrated into its DePIN network

Akash calls itself the "Airbnb of hosting" and positions itself as a "Supercloud" alternative to traditional platforms such as AWS, supporting storage, GPU, and CPU computing. Using developer-friendly tools such as the Akash container platform and Kubernetes-managed compute nodes, it can deploy software seamlessly across environments and run any cloud-native application.


Highlights:

  1. Targets a wide range of computing tasks from general computing to web hosting

  2. AkashML allows its GPU network to run over 15,000 models from Hugging Face through its Hugging Face integration

  3. Notable applications hosted on Akash include Mistral AI's LLM chatbot, Stability AI's SDXL text-to-image model, and Thumper AI's new base model AT-1

  4. Platforms building the Metaverse, AI deployment, and federated learning are leveraging Supercloud

io.net provides access to distributed GPU cloud clusters that are specialized for AI and ML use cases. It aggregates GPUs from data centers, crypto miners, and other decentralized networks. The company was previously a quantitative trading firm that pivoted to its current business after the price of high-performance GPUs increased significantly.

Highlights:

  1. Its IO-SDK is compatible with frameworks such as PyTorch and TensorFlow, and its multi-layer architecture can scale automatically and dynamically according to computing needs

  2. Supports the creation of three different types of clusters, which can be launched within two minutes

  3. Strong collaborative efforts to integrate GPUs from other DePIN networks, including Render, Filecoin, Aethir, and Exabits

Gensyn provides GPU computing power focused on machine learning and deep learning. It claims to achieve a more efficient verification mechanism than existing approaches by combining proof-of-learning to validate completed work, a graph-based pinpointing protocol to re-run only the disputed portions, and a Truebit-style incentive game with staking and slashing for computation providers.
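Gensyn's actual protocol is considerably more involved, but the staking-and-slashing core of a Truebit-style verification game can be sketched in a few lines. Everything below (`train_step`, `VerificationGame`, the hash commitment) is an illustrative stand-in, not Gensyn's implementation:

```python
import hashlib
import json

def checkpoint_hash(state: dict) -> str:
    # Deterministically hash a training checkpoint (a proof-of-learning-style commitment).
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()

def train_step(weights: list, grads: list) -> list:
    # Stand-in for one deterministic training step (SGD with a fixed learning rate).
    return [w - 0.1 * g for w, g in zip(weights, grads)]

class VerificationGame:
    """Toy stake/slash game between a solver and a verifier."""

    def __init__(self):
        self.stakes = {}   # solver -> staked tokens
        self.claims = {}   # solver -> committed checkpoint hash

    def submit(self, solver: str, stake: float, claimed_hash: str) -> None:
        # A solver bonds a stake behind its claimed result.
        self.stakes[solver] = stake
        self.claims[solver] = claimed_hash

    def challenge(self, solver: str, weights: list, grads: list) -> str:
        # A verifier re-runs the pinpointed step and compares commitments.
        recomputed = checkpoint_hash({"w": train_step(weights, grads)})
        if recomputed == self.claims[solver]:
            return "accepted"
        self.stakes[solver] = 0.0   # slash the dishonest solver
        return "slashed"

game = VerificationGame()
w, g = [1.0, 2.0], [0.5, 0.5]
game.submit("honest", 100.0, checkpoint_hash({"w": train_step(w, g)}))
print(game.challenge("honest", w, g))    # accepted
game.submit("cheater", 100.0, "bogus-hash")
print(game.challenge("cheater", w, g))   # slashed
```

The economics, not the re-execution, do the heavy lifting: because only disputed steps are re-run and dishonest solvers forfeit their stake, honest computation is cheaper than cheating in expectation.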

Highlights:

  1. The estimated hourly cost of a V100 equivalent GPU is approximately $0.40/hour, resulting in significant cost savings

  2. Proof stacking allows pre-trained base models to be fine-tuned to accomplish more specific tasks

  3. These foundational models will be decentralized, globally owned, and provide additional capabilities beyond hardware computing networks.

Aethir runs on enterprise GPUs and focuses on compute-intensive fields, mainly artificial intelligence, machine learning (ML), and cloud gaming. Containers in its network act as virtual endpoints for cloud-based applications, moving workloads from local devices into containers for a low-latency experience. To ensure high-quality service, it repositions GPUs closer to data sources based on demand and location.


Highlights:

  1. In addition to artificial intelligence and cloud gaming, Aethir has also expanded into cloud phone services and partnered with APhone to launch a decentralized cloud smartphone.

  2. Extensive partnerships with major Web2 companies including NVIDIA, Super Micro, HPE, Foxconn and Well Link

  3. Multiple partners in Web3, such as CARV, Magic Eden, Sequence, Impossible Finance, etc.

Phala Network acts as the execution layer for Web3 AI solutions. Its blockchain is a trustless cloud computing solution that addresses privacy through its Trusted Execution Environment (TEE) design. Rather than serving as a computational layer for AI models, its execution layer enables AI agents to be controlled by on-chain smart contracts.


Highlights:

  1. Acts as a co-processor protocol for verifiable computation, while also enabling AI agents to use on-chain resources

  2. Its AI agent contracts can access top large language models such as OpenAI, Llama, Claude, and Hugging Face through Redpill

  3. In the future, it will incorporate multiple proof systems, including zk-proofs, multi-party computation (MPC), and fully homomorphic encryption (FHE)

  4. Plans to support the H100 and other TEE-capable GPUs in the future to improve computing power

Project Comparison

|                     | Render                    | Akash                             | io.net                     | Gensyn                      | Aethir                                   | Phala                     |
|---------------------|---------------------------|-----------------------------------|----------------------------|-----------------------------|------------------------------------------|---------------------------|
| Hardware            | GPU & CPU                 | GPU & CPU                         | GPU & CPU                  | GPU                         | GPU                                      | CPU                       |
| Business focus      | Graphics rendering and AI | Cloud computing, rendering, and AI | AI                        | AI                          | AI, cloud gaming, and telecommunications | On-chain AI execution     |
| AI task type        | Inference                 | Both                              | Both                       | Training                    | Training                                 | Execution                 |
| Job pricing         | Performance-based pricing | Reverse auction                   | Market pricing             | Market pricing              | Bidding system                           | Stake-based               |
| Blockchain          | Solana                    | Cosmos                            | Solana                     | Gensyn                      | Arbitrum                                 | Polkadot                  |
| Data privacy        | Encryption & hashing      | mTLS authentication               | Data encryption            | Secure mapping              | Encryption                               | TEE                       |
| Job fees            | 0.5-5% per job            | 20% USDC, 4% AKT                  | 2% USDC, 0.25% reserve fee | Low cost                    | 20% per session                          | Proportional to stake     |
| Security            | Proof of Render           | Proof of Stake                    | Proof of Computation       | Proof of Stake              | Proof of Rendering Capability            | Inherited from relay chain |
| Proof of completion | -                         | -                                 | Time-lock proof            | Proof of Learning           | Proof of Rendering Work                  | TEE attestation           |
| Quality assurance   | Dispute resolution        | -                                 | -                          | Verifiers and whistleblowers | Checker nodes                           | Remote attestation        |
| GPU clusters        | No                        | Yes                               | Yes                        | Yes                         | Yes                                      | No                        |

Importance

Availability of Clusters and Parallel Computing

Distributed computing frameworks implement GPU clustering, providing more efficient training and better scalability without compromising model accuracy. Training more complex AI models demands powerful computing, which often must rely on distributed computing. To put this in perspective, OpenAI's GPT-4 model has more than 1.8 trillion parameters and was trained over 3-4 months using approximately 25,000 Nvidia A100 GPUs across 128 clusters.
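A quick back-of-the-envelope calculation shows the scale those figures imply (the hourly rate is an assumption for illustration only):

```python
# Figures cited in the article for GPT-4 training.
gpus = 25_000              # A100s across 128 clusters
months = 3.5               # midpoint of the 3-4 month estimate

hours = months * 30 * 24   # approximate wall-clock hours
gpu_hours = gpus * hours
print(f"{gpu_hours:,.0f} GPU-hours")   # 63,000,000 GPU-hours

rate = 1.50                # assumed $/hour per A100 (illustrative, not a quoted price)
cost = gpu_hours * rate
print(f"~${cost / 1e6:.1f}M of compute at ${rate}/hr")
```

Tens of millions of GPU-hours for a single frontier model is exactly the kind of demand that no pool of scattered consumer GPUs can absorb, which is why clustering matters.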

Previously, Render and Akash only offered single-purpose GPUs, which may have limited market demand for their GPUs. However, most major projects have now integrated clusters for parallel computing. io.net has worked with projects such as Render, Filecoin, and Aethir to bring more GPUs into its network, and successfully deployed more than 3,800 clusters in Q1 2024. Although Render does not support clusters, it works similarly, breaking a single job into segments so that multiple nodes process different ranges of frames simultaneously. Phala currently only supports CPUs, but allows CPU workers to be clustered.

Incorporating cluster frameworks into the AI workflow network is important, but the number and type of cluster GPUs required to meet AI developers' needs is a separate issue, which we discuss in a later section.

Data Privacy

Developing AI models requires the use of large datasets, which may come from a variety of sources and in various forms. Sensitive datasets such as personal medical records and user financial data may be at risk of being exposed to model providers. Samsung internally banned the use of ChatGPT due to concerns that uploading sensitive code to the platform would violate privacy, and Microsoft's 38TB private data leak further highlighted the importance of taking adequate security measures when using AI. Therefore, having a variety of data privacy methods is critical to returning data control to data providers.

Most of the projects covered use some form of data encryption to protect data privacy. Data encryption ensures that data transfer from data providers to model providers (data recipients) in the network is protected. Render uses encryption and hashing when publishing rendering results back to the network, while io.net and Gensyn employ some form of data encryption. Akash uses mTLS authentication to only allow tenants to receive data from providers they choose.
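To illustrate the hashing side of this approach, a provider can publish a keyed digest alongside its output so the recipient can verify both integrity and origin. This is a minimal stdlib sketch under assumed names (`SHARED_SECRET`, `publish`, `verify`), not any project's actual scheme:

```python
import hashlib
import hmac

# Hypothetical secret negotiated between provider and requester out of band
# (in practice this role is played by session keys from a TLS/mTLS handshake).
SHARED_SECRET = b"session-key-negotiated-out-of-band"

def publish(result: bytes) -> tuple[bytes, str]:
    # Provider sends the result plus a keyed digest of it.
    tag = hmac.new(SHARED_SECRET, result, hashlib.sha256).hexdigest()
    return result, tag

def verify(result: bytes, tag: str) -> bool:
    # Requester recomputes the digest; any tampering in transit changes it.
    expected = hmac.new(SHARED_SECRET, result, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)

data, tag = publish(b"rendered-frame-0042")
print(verify(data, tag))                        # True
print(verify(b"tampered-frame", tag))           # False
```

A digest alone proves integrity, not confidentiality; that is why the projects above pair hashing with encryption or mTLS for the transport itself.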

However, io.net recently launched fully homomorphic encryption (FHE) in collaboration with Mind Network, which allows encrypted data to be processed without first decrypting it . This innovation can ensure data privacy better than existing encryption technologies by enabling data to be securely transmitted for training purposes without revealing identities and data content.


Phala Network introduces TEEs, secure areas within a device's main processor. Through this isolation mechanism, it prevents external processes from accessing or modifying data regardless of their permission level, even for individuals with physical access to the machine. In addition to TEEs, it incorporates zk-proofs via its zkDCAP validator and jtee command-line interface for integration with the RiscZero zkVM.

Proof of Completion and Quality Check

The GPUs provided by these projects power a range of services, from graphics rendering to AI computation, so the final quality of a task may not always meet the user's standards. A proof of completion can indicate that the specific GPU the user rented was indeed used to run the desired service, and quality checks benefit the user who requested the work.

Once computation is complete, both Gensyn and Aethir generate proofs showing that the work was done, while io.net's proof indicates that the rented GPU's performance was fully utilized without issues. Both Gensyn and Aethir run quality checks on completed computations. Gensyn uses validators to re-run parts of the generated proofs and check them against the originals, with whistleblowers acting as a further check on the validators. Aethir uses checker nodes to determine quality of service, penalizing subpar service. Render recommends a dispute-resolution process that slashes a node if the review committee finds problems with it. Phala generates a TEE proof on completion, ensuring the AI agent performed the required actions on-chain.
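The details of Aethir's checker nodes are not public, but the basic shape of such a quality check — sample service metrics and flag providers that miss a service-level target — can be sketched as follows (the metric, threshold, and names are all illustrative):

```python
import statistics

def quality_check(latencies_ms: list[float], sla_p95_ms: float = 50.0) -> dict:
    """Toy checker-node logic: estimate the 95th-percentile latency of a
    provider's sampled responses and flag it for penalty if it breaches
    the (hypothetical) SLA threshold."""
    p95 = statistics.quantiles(latencies_ms, n=20)[-1]  # ~95th percentile cut point
    return {"p95_ms": p95, "penalize": p95 > sla_p95_ms}

# A provider with consistently fast responses passes; a slow one is flagged.
print(quality_check([10.0] * 100))    # penalize: False
print(quality_check([120.0] * 100))   # penalize: True
```

The on-chain half — escrowing rewards and slashing flagged providers — would sit on top of a check like this, analogous to the dispute flows described above.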

Hardware Statistics

|                    | Render | Akash  | io.net | Gensyn          | Aethir          | Phala   |
|--------------------|--------|--------|--------|-----------------|-----------------|---------|
| Number of GPUs     | 5,600  | 384    | 38,177 | -               | 40,000+         | -       |
| Number of CPUs     | 114    | 14,672 | 5,433  | -               | -               | 30,000+ |
| H100/A100 count    | -      | 157    | 2,330  | -               | 2,000+          | -       |
| H100 fee/hour      | -      | $1.46  | $1.19  | -               | -               | -       |
| A100 fee/hour      | -      | $1.37  | $1.50  | $0.55 (estimated) | $0.33 (estimated) | -     |

Requirements for High-Performance GPUs

Since AI model training demands the best-performing GPUs, developers tend to use Nvidia's A100 and H100, which offer the best quality despite the latter's high market price. That the A100 can not only train all workloads but also do so faster shows how much the market values this hardware. And with 4x the inference performance of the A100, the H100 is now the GPU of choice, especially for large companies training their own LLMs.


For decentralized GPU marketplaces to compete with their Web2 peers, it is not enough to offer lower prices; they must also meet the market's actual needs. In 2023, Nvidia shipped more than 500,000 H100s to centralized large technology companies, making equivalent hardware costly and difficult to acquire for anyone competing with large cloud providers. The amount of hardware these projects can bring into their networks at low cost therefore matters for expanding these services to a larger customer base.

While each project has a presence in AI and ML computing, they differ in the capacity they can provide. Akash has just over 150 H100 and A100 units in total, while io.net and Aethir each have more than 2,000. Typically, pre-training an LLM or generative model from scratch requires a cluster of at least 248 to more than 2,000 GPUs, so the latter two projects are better suited to large-scale model computing.

Depending on the cluster size required by such developers, the cost of these decentralized GPU services on the market today is already much lower than centralized GPU services. Gensyn and Aethir both claim to be able to rent A100-equivalent hardware for less than $1 per hour, but this still needs to be proven over time.
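As a rough back-of-the-envelope comparison, using Gensyn's estimated A100 rate from the table above against an assumed centralized on-demand rate (centralized A100 pricing varies widely in practice, so the $4.00 figure is illustrative only):

```python
# Hypothetical month-long rental of a 1,000-GPU A100 cluster.
gpus = 1_000
hours = 30 * 24                 # ~one month of wall-clock time

decentralized_rate = 0.55       # Gensyn's estimated $/hr (from the table above)
centralized_rate = 4.00         # assumed centralized on-demand $/hr (illustrative)

decentralized_cost = gpus * hours * decentralized_rate
centralized_cost = gpus * hours * centralized_rate
print(f"decentralized: ${decentralized_cost:,.0f}/month")   # $396,000/month
print(f"centralized:   ${centralized_cost:,.0f}/month")     # $2,880,000/month
```

Even if the decentralized rates prove optimistic, the gap at cluster scale is large enough that a meaningful discount would survive; the open question, as the article notes, is whether these prices hold up under real sustained workloads.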

Network-connected GPU clusters offer large GPU counts at a lower hourly cost, but they are memory-constrained compared to NVLink-connected GPUs. NVLink enables direct communication between multiple GPUs without routing data through the CPU, achieving high bandwidth and low latency. NVLink-connected GPUs are therefore best suited to LLMs with many parameters and large datasets, which require high performance and intensive computation.

Still, decentralized GPU networks offer powerful, scalable computing for distributed tasks, serving users with dynamic workloads or those who need the flexibility to distribute work across multiple nodes. By providing a more cost-effective alternative to centralized cloud and data providers, these networks open up opportunities to build more AI and ML use cases outside the centralized oligopoly.

Providing Consumer-Grade GPUs/CPUs

While GPUs are the primary processing units for rendering and computation, CPUs also play an important role in training AI models. They are used in many parts of training, from data preprocessing to memory-resource management, which is very useful for model developers. Consumer-grade GPUs can also handle less intensive tasks, such as fine-tuning already pre-trained models or training smaller models on smaller datasets at a more affordable cost.

While projects like Gensyn and Aethir are primarily focused on enterprise GPUs, other projects like Render, Akash, and io.net can also serve this part of the market, given that over 85% of consumer GPU resources are idle. Providing these options allows them to develop their own market niche, allowing them to focus on large-scale intensive computing, more general small-scale rendering, or a mix between the two.

Conclusion

The AI DePIN field is still relatively new and faces its own challenges. These solutions have been criticized on feasibility grounds and have suffered setbacks; for example, io.net was accused of faking the GPU count on its network, and later addressed the problem by introducing a proof-of-work process to verify devices and prevent Sybil attacks.

Despite this, there has been a significant increase in the number of tasks and hardware executed in these decentralized GPU networks. The increasing volume of tasks executed on these networks highlights the growing demand for alternatives to Web2 cloud provider hardware resources. At the same time, the proliferation of hardware providers in these networks highlights previously underutilized supply. This trend further demonstrates the product-market fit of AI DePIN networks as they effectively address both demand and supply challenges.

Looking ahead, the trajectory of AI development points to a thriving multi-trillion dollar market, and we believe these decentralized GPU networks will play a key role in providing developers with cost-effective computing alternatives. By leveraging their networks to continually bridge the gap between demand and supply, these networks will make a significant contribution to the future landscape of AI and computing infrastructure.
