Analyzing the potential and practical challenges of combining AI and crypto


Author: @ed_roman; Translated by: Vernacular Blockchain

Recently, artificial intelligence has become one of the hottest and most promising areas in the crypto market, including:

  • Decentralized AI training

  • GPU Decentralized Physical Infrastructure Network

  • Uncensored AI models

Are these groundbreaking developments or just hype?

At @hack_vc, we’re working hard to clear the fog and separate promise from reality. In this article, we’ll take a deep dive into the top ideas in crypto and AI. Let’s explore the real challenges and opportunities.

1. Challenges of combining Web3 and AI

1. Decentralized AI training

The problem with on-chain AI training is that training requires high-speed communication and coordination between GPUs, because neural networks rely on backpropagation, which must synchronize gradients across the cluster at every step. Nvidia offers two innovative technologies for this (NVLink and InfiniBand). They make GPU communication extremely fast, but they are local-only technologies that work within GPU clusters inside a single data center (at 50+ gigabit speeds).

If a decentralized network is introduced, speed drops dramatically due to higher network latency and lower bandwidth. Compared to the throughput of Nvidia's high-speed interconnects inside a data center, this is simply not feasible for AI training use cases. In addition, network bandwidth and storage costs in a decentralized environment are much higher than those of SSDs in a local cluster.
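To make the bandwidth gap concrete, here is a back-of-the-envelope sketch of the time needed to synchronize gradients for one training step via ring all-reduce. The model size, link speeds, and simplified cost model are all illustrative assumptions, not measurements:

```python
# Rough time to synchronize gradients for one training step using ring
# all-reduce, which moves roughly 2x the gradient size per node per step.
# Real systems overlap communication with compute and use compression,
# so treat these numbers as order-of-magnitude illustrations only.

def allreduce_seconds(num_params: float, bytes_per_param: int,
                      bandwidth_bps: float) -> float:
    gradient_bits = num_params * bytes_per_param * 8
    return 2 * gradient_bits / bandwidth_bps

PARAMS = 70e9   # a 70B-parameter model (assumed)
FP16 = 2        # bytes per gradient value

datacenter = allreduce_seconds(PARAMS, FP16, 400e9)  # ~400 Gbps InfiniBand
internet = allreduce_seconds(PARAMS, FP16, 100e6)    # ~100 Mbps home uplink

print(f"Datacenter sync per step: {datacenter:.1f} s")       # ~5.6 s
print(f"Internet sync per step:   {internet / 3600:.1f} h")  # ~6.2 h
```

With thousands of steps per training run, a multi-hour synchronization per step is a non-starter, which is exactly the problem described above.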

Another problem with training AI models on-chain is that this market is much less attractive than inference. Currently, a large amount of GPU compute is used to train large language models (LLMs). But in the long run, inference will become the dominant GPU workload. Think about it: how many LLMs need to be trained to meet demand, versus how many customers will use those models?

Note that there are already some innovations in this area that may offer hope for the future of on-chain AI training:

1) Distributed training over InfiniBand is being done at scale, and NVIDIA itself supports non-local distributed training through its collective communications library (NCCL). However, this is still nascent, and adoption remains to be seen. The bottleneck imposed by physical distance still exists, so local training over InfiniBand remains significantly faster.

2) New research has been published on decentralized training with reduced communication synchronization, which may make decentralized training more practical in the future (see the sketch after this list).

3) Intelligent sharding and training scheduling can help improve performance. Similarly, there may be new model architectures designed specifically for distributed infrastructure in the future (Gensyn is conducting research in these areas).

4) Innovations such as Neuromesh attempt to achieve distributed training at a lower cost through a new approach called predictive coding networks (PCNs).
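As a rough illustration of the reduced-synchronization idea in point 2 above, the following local-SGD-style sketch has each worker take many purely local steps on a toy objective, averaging parameters only once per round, so communication drops by a factor of the sync interval H. All numbers and the structure are illustrative assumptions, not any project's actual protocol:

```python
# Local SGD sketch: workers train independently for H steps, then the
# only communication is a single parameter average per round.
import numpy as np

rng = np.random.default_rng(0)
num_workers, dim, H, rounds, lr = 4, 8, 50, 10, 0.05
target = rng.normal(size=dim)                   # optimum of the toy loss
params = [np.zeros(dim) for _ in range(num_workers)]

for _ in range(rounds):
    for w in range(num_workers):
        for _ in range(H):                      # H purely local steps
            grad = params[w] - target + 0.01 * rng.normal(size=dim)
            params[w] -= lr * grad
    avg = np.mean(params, axis=0)               # the only network traffic
    params = [avg.copy() for _ in range(num_workers)]

print("distance to optimum:", np.linalg.norm(params[0] - target))
```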

2. Decentralized AI data iteration

The data side of training is also a challenge. Any AI training process involves processing huge amounts of data. Typically, models are trained on centralized, secure data storage systems with high scalability and performance. This requires transferring and processing terabytes of data, and it is not a one-time cycle. Data is usually noisy and contains errors, so before training it must be cleaned and transformed into a usable format. This stage involves repetitive standardization, filtering, and missing-value handling, all of which pose serious challenges in a decentralized environment.
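For a sense of what this cleaning stage looks like in practice, here is a minimal pandas sketch covering deduplication, filtering, standardization, and missing-value handling. The column names and thresholds are hypothetical placeholders:

```python
import pandas as pd

def clean_batch(df: pd.DataFrame) -> pd.DataFrame:
    df = df.drop_duplicates()                     # remove exact repeats
    df = df.dropna(subset=["text"])               # drop unusable rows
    df = df[df["text"].str.len() > 20]            # filter short noise
    return df.assign(
        text=df["text"].str.strip().str.lower(),  # standardize
        label=df["label"].fillna("unknown"),      # impute missing labels
    )

raw = pd.DataFrame({
    "text": ["  An Example Training Document  ", "ok",
             "  An Example Training Document  ", None],
    "label": ["a", None, "a", None],
})
print(clean_batch(raw))   # one clean, normalized row survives
```

Centralized pipelines run millions of such batches with retries and monitoring; coordinating the same loop across untrusted, heterogeneous nodes is where the difficulty lies.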

The data work is also iterative, which does not fit Web3 well. It took OpenAI thousands of iterations to achieve its results: if the current model does not achieve the expected results, experts go back to the data collection or model training stage to improve them. Now imagine running this loop in a decentralized environment, where the best existing frameworks and tools are not readily available in Web3.

One promising project is 0g.ai (backed by Hack VC), which provides on-chain data storage and data availability infrastructure, with a faster architecture and the ability to store large amounts of data on-chain.

3. Reaching consensus through overly redundant AI inference computation

One challenge of combining crypto with AI is verifying the accuracy of AI inference, because you cannot fully trust a single centralized party to run the inference, and nodes may misbehave. This challenge does not exist in Web2 AI, because there is no decentralized consensus system.

One solution is redundant computing: multiple nodes repeat the same AI inference operation, so the system operates trustlessly and avoids single points of failure.
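A minimal sketch of this pattern, with simulated nodes standing in for independent GPU providers (in a real network these would be remote calls), might look like:

```python
from collections import Counter

def query_node(node_id: int, prompt: str) -> str:
    # Simulated provider: node 2 misbehaves and returns a bogus result.
    return "bogus" if node_id == 2 else f"answer({prompt})"

def redundant_inference(prompt: str, num_nodes: int = 5) -> str:
    # Send the same request to every node and accept the majority answer.
    results = [query_node(i, prompt) for i in range(num_nodes)]
    winner, votes = Counter(results).most_common(1)[0]
    if votes <= num_nodes // 2:
        raise RuntimeError("no majority; escalate to dispute resolution")
    return winner

print(redundant_inference("2+2"))  # honest majority outvotes the cheat
```

Note that every replica multiplies the GPU bill, which is exactly the problem discussed next.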

The problem with this approach is that we live in a world with a severe shortage of high-end AI chips. Waiting lists for high-end NVIDIA chips run to years, driving prices up. Requiring AI inference to be re-executed multiple times on multiple nodes multiplies these already-expensive costs. For many projects, that is simply not viable.

4. Web3-specific AI use cases (short term)

Some have suggested that Web3 should have its own unique AI use cases, specifically for Web3 clients.

Currently, this is still an emerging market and use cases are still being discovered. Some challenges include:

  • Web3-native use cases currently require fewer AI transactions, because market demand is still in its infancy.

  • Fewer customers: there are orders of magnitude fewer Web3 customers than Web2 customers, so the addressable market is far smaller.

  • The customers themselves are less stable, because they are startups with less funding, and some will go out of business over time. An AI service provider catering to Web3 customers may need to keep re-acquiring part of its customer base to replace the ones that fold, making it harder to scale.

In the long term, we are very bullish on Web3-native AI use cases, especially as AI agents become more prevalent. We envision a future where every Web3 user has multiple AI agents assisting them. An early leader in this space is Theoriq.ai, who are building a platform of composable AI agents that can serve both Web2 and Web3 clients (backed by Hack VC).

5. Consumer GPU Decentralized Physical Infrastructure Network (DePIN)

There are many decentralized AI computing networks that rely on consumer-grade GPUs rather than GPUs in data centers. Consumer-grade GPUs are suitable for low-end AI inference tasks or consumer use cases with flexible latency, throughput, and reliability requirements. But for serious enterprise use cases (i.e., those that occupy the majority of the market share), customers expect the network to be more reliable than their home machines, and complex inference tasks generally require higher-end GPUs. For these more valuable customer use cases, data centers are more suitable.

It is important to note that we believe that consumer-grade GPUs are suitable for demonstration purposes or individuals and startups that can tolerate lower reliability. However, these customers are generally of lower value, so we believe that decentralized physical infrastructure networks (DePIN) for Web2 enterprises will be more valuable in the long run. Therefore, well-known GPU DePIN projects have generally evolved from using mainly consumer-grade hardware in the early days to now having A100/H100 and cluster-level availability.

2. Practical and feasible use cases of crypto x AI

Now, let’s discuss use cases where crypto x AI can significantly add value.

Real Benefit #1: Serving Web2 customers

McKinsey estimates that generative AI could add the equivalent of $2.6 trillion to $4.4 trillion per year across the 63 use cases it analyzed; by comparison, the UK's entire GDP in 2021 was $3.1 trillion. That would increase the impact of all artificial intelligence by 15% to 40%, and the estimate roughly doubles if generative AI is embedded into software currently used for other tasks.

Interestingly:

  • Based on the above estimates, this means that the total market value of global AI (not just generative AI) could reach tens of trillions of dollars.

  • By comparison, the total value of all cryptocurrencies (including Bitcoin and all altcoins) combined is only about $2.7 trillion today.

So, let's be realistic: the vast majority of customers who need AI in the short term will be Web2 customers, since the set of Web3 customers who actually need AI is only a small slice of this $2.7 trillion market (consider that BTC accounts for roughly half of that market cap, and Bitcoin itself doesn't need or use AI).

Web3 AI use cases are just getting started, and it is not yet clear how big the market will be. But one thing is intuitively certain: it will be only a fraction of the Web2 market for the foreseeable future. We believe Web3 AI still has a bright future, but in the near term its most common application is serving Web2 customers.

Examples of Web2 clients that could benefit from Web3 AI include:

  • Vertical industry software companies built from the ground up with AI at their core (e.g. Cedar.ai or Observe.ai)

  • Large enterprises that fine-tune models for their own purposes (e.g. Netflix)

  • Fast-growing AI providers (e.g. Anthropic)

  • Software companies that add AI capabilities to existing products (e.g. Canva)

This is a relatively stable customer base because these customers are generally large and high value. They are unlikely to go out of business in the short term and represent a very large potential customer base for AI services. Web3 AI services that serve Web2 customers will benefit from this stable customer base.

But why would a Web2 client want to use the Web3 stack? The rest of this article will explain this rationale.

Real Benefit #2: Reducing GPU costs through GPU Decentralized Physical Infrastructure Networks (GPU DePINs)

GPU DePINs aggregate underutilized GPU computing power (most of which comes from data centers) and make these resources available for AI inference. Think of it simply as "Airbnb for GPUs" (collaborative consumption of underutilized assets).

We are excited about GPU DePINs because, as noted above, the NVIDIA chip shortage means many GPU cycles are currently wasted that could be used for AI inference. The hardware owners have already incurred a sunk cost and are not fully utilizing their equipment today, so these spare GPU cycles can be offered far below the going rate; the revenue is effectively a windfall for the hardware owner.

Specific examples include:

1) AWS machines: if you rent an H100 from AWS today, you must commit to a lease of at least one year because market supply is tight. This creates waste, because you are unlikely to use the GPU 7 days a week, 365 days a year (a rough utilization calculation follows this list).

2) Filecoin mining hardware: The Filecoin network has a large subsidized supply, but not much actual demand. Unfortunately, Filecoin never found true product-market fit, so Filecoin miners are in danger of going bankrupt. These machines are equipped with GPUs and can be repurposed for low-end AI inference tasks.

3) ETH Mining Hardware: When ETH moved from Proof of Work (PoW) to Proof of Stake (PoS), a large amount of hardware immediately became available, which can be repurposed for AI inference.
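Here is the rough utilization arithmetic referenced in the AWS example above; the hourly rate and utilization figure are illustrative assumptions, not quotes:

```python
# Effective cost of a committed GPU at partial utilization (all assumed).
committed_rate = 5.0   # $/hour for a year-long H100 reservation
utilization = 0.40     # fraction of reserved hours actually used

print(f"Effective cost per utilized hour: ${committed_rate / utilization:.2f}")
# -> $12.50, versus $5.00 nominal

# The idle hours are a sunk cost: reselling them at any price above $0
# is a windfall, which is why a DePIN can undercut the committed rate.
idle_hours = (1 - utilization) * 24 * 365
print(f"Idle hours per year: {idle_hours:,.0f}")   # -> 5,256
```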

The GPU DePIN market is highly competitive, with multiple players offering products, such as Aethir, Exabits, and Akash. Hack VC chose to back io.net, which also aggregates supply through partnerships with other GPU DePINs, and currently supports the largest GPU supply on the market.

It is important to note that not all GPU hardware is suitable for AI inference. One obvious reason is that older GPUs do not have enough GPU memory to handle large language models (LLMs), although there have been interesting innovations here. Exabits, for example, has developed technology that loads active neurons into GPU memory and inactive neurons into CPU memory, predicting which neurons need to be active or inactive. This makes it possible to run AI workloads on lower-end GPUs even when GPU memory is limited, effectively improving their practicality for AI inference.
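As a conceptual illustration of that hot/cold split (not Exabits' actual implementation), the PyTorch sketch below keeps neurons predicted to be active resident in GPU memory while the rest stay in CPU RAM; the activity predictor here is a random placeholder:

```python
import torch

class OffloadedLayer:
    def __init__(self, weight: torch.Tensor, hot_fraction: float = 0.2):
        # Placeholder predictor: rank neurons by a random activity score.
        scores = torch.rand(weight.shape[0])
        k = int(hot_fraction * weight.shape[0])
        self.hot_idx = scores.topk(k).indices
        hot_set = set(self.hot_idx.tolist())
        self.cold_idx = torch.tensor(
            [i for i in range(weight.shape[0]) if i not in hot_set])
        device = "cuda" if torch.cuda.is_available() else "cpu"
        self.hot_w = weight[self.hot_idx].to(device)   # resident on GPU
        self.cold_w = weight[self.cold_idx]            # stays in CPU RAM

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = torch.empty(self.hot_w.shape[0] + self.cold_w.shape[0])
        # Hot neurons run on the accelerator, cold neurons on the CPU.
        out[self.hot_idx] = (self.hot_w @ x.to(self.hot_w.device)).cpu()
        out[self.cold_idx] = self.cold_w @ x
        return out

layer = OffloadedLayer(torch.randn(256, 64))
print(layer.forward(torch.randn(64)).shape)   # torch.Size([256])
```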

Additionally, Web3 AI DePINs will need to strengthen their products over time, providing enterprise-grade services such as single sign-on (SSO), SOC 2 compliance, service level agreements (SLAs), etc. This will be comparable to the cloud services currently enjoyed by Web2 customers.

Real Benefit #3: Uncensored models that avoid OpenAI's self-censorship

There has been much discussion about AI censorship. Turkey, for example, temporarily banned OpenAI (it later lifted the ban after OpenAI improved its compliance). We believe this kind of national-level censorship is not fundamentally concerning, since countries must embrace AI to remain competitive.

Even more interesting is OpenAI's self-censorship. For example, OpenAI will not handle NSFW (Not Safe For Work) content, nor will it predict the outcome of the next presidential election. We think there is an interesting and large market for the AI applications OpenAI refuses to touch for political reasons.

Open source is a great way to solve this, since a GitHub repository is not beholden to shareholders or a board of directors. One example is Venice.ai, which promises to protect user privacy and operate without censorship; its open source nature is, of course, what makes this possible. Web3 AI can take this further by running these open source software (OSS) models on low-cost GPU clusters for inference. For these reasons, we believe OSS + Web3 is the ideal combination to pave the way for uncensored AI.

Real Benefit #4: Avoid sending personally identifiable information to OpenAI

Many large enterprises have privacy concerns about their internal corporate data. For these customers, it is difficult to trust a centralized third party like OpenAI to handle this data.

For these businesses, using web3 may at first seem even scarier, since their internal data would suddenly live on a decentralized network. For AI, however, several privacy-enhancing technologies are already emerging:

  • Trusted execution environments (TEEs), such as Super Protocol

  • Fully homomorphic encryption (FHE), such as Fhenix.io (a portfolio company of funds managed by Hack VC) or Inco Network (both powered by Zama.ai), and Bagel's PPML

These technologies are still maturing, and performance keeps improving with upcoming zero-knowledge (ZK) and FHE ASICs. The long-term goal is to enable fine-tuning of models while keeping enterprise data protected. As these protocols mature, web3 may become a more attractive venue for privacy-preserving AI computation.
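To illustrate the data flow these technologies enable, here is a toy sketch in which the client encrypts values, the server aggregates ciphertexts without ever seeing plaintext, and only the client can decrypt the result. It uses simple additive masking purely to show the pattern; real systems such as the ones above use actual homomorphic schemes with formal security guarantees:

```python
import secrets

MOD = 2**64

class Client:
    """Holds the secret masks; only the client can recover plaintext."""
    def __init__(self):
        self.masks = []

    def encrypt(self, value: int) -> int:
        r = secrets.randbelow(MOD)
        self.masks.append(r)
        return (value + r) % MOD

    def decrypt_sum(self, ciphertext_sum: int) -> int:
        return (ciphertext_sum - sum(self.masks)) % MOD

def server_aggregate(ciphertexts: list[int]) -> int:
    # The untrusted server adds ciphertexts; plaintext is never visible.
    return sum(ciphertexts) % MOD

client = Client()
salaries = [120_000, 95_000, 143_000]         # sensitive enterprise data
encrypted = [client.encrypt(s) for s in salaries]
print(client.decrypt_sum(server_aggregate(encrypted)))  # 358000
```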

Real Benefit #5: Leverage the latest innovations in open source models

Over the past few decades, open source software (OSS) has steadily eroded proprietary software's market share. We view LLMs as a sophisticated form of proprietary software that is ripe for open source disruption. Notable challengers include Llama, RWKV, and Mistral.ai, and this list will undoubtedly grow over time (a more comprehensive list is available at Openrouter.ai). By leveraging web3 AI powered by open source models, people can take full advantage of these new innovations.

We believe that over time, a global open source development effort, combined with crypto incentives, can drive rapid innovation in open source models and in the agents and frameworks built on top of them. One example of an AI agent protocol is Theoriq, which leverages open source models to create a composable, interconnected network of AI agents that can be assembled into more advanced AI solutions.

We believe this because of past precedent: most "developer software" has eventually been overtaken by open source. There is a reason Microsoft, once the archetypal proprietary software company, is now the top corporate contributor to GitHub. Look at how Databricks, PostgreSQL, MongoDB, and others have disrupted proprietary databases; the precedent here is strong.

However, there is a catch. One thorny issue for OSS LLMs is that OpenAI has begun signing paid data-licensing agreements with organizations such as Reddit and the New York Times. If this trend continues, OSS LLMs may find it increasingly difficult to compete because of the economic barrier to acquiring data. Nvidia may double down on confidential computing as an enabler of secure data sharing. Time will tell how this plays out.

Real Benefit #6: Consensus via random sampling with steep penalties, or via zero-knowledge proofs

Verification is a challenge for web3 AI inference. Validators have an opportunity to cheat on results to earn fees, so verifying inference is an important safeguard. Note that even though web3 AI inference is still in its infancy, such cheating is inevitable unless the economic incentive for it is neutralized.

The standard web3 approach is to have multiple validators repeat the same operation and compare results. But as noted earlier, AI inference is very expensive given the current shortage of high-end Nvidia chips. Since web3's core value proposition is lower-cost inference via underutilized GPU DePINs, redundant computation would severely undermine that proposition.

A more promising solution is zero-knowledge proofs for off-chain AI inference. Here, a succinct zero-knowledge proof is verified to determine whether a model was trained correctly or whether inference ran correctly (known as zkML). Examples include Modulus Labs and ZKonduit. Because zero-knowledge operations demand considerable compute, the performance of these solutions is still nascent, though it should improve as ZK hardware ASICs launch in the near future.

An even more promising idea is an "optimistic", sampling-based approach to AI inference: verify only a small fraction of the results validators generate, but set the economic cost of being caught cheating high enough to create a strong deterrent. This saves the redundant computation (see, for example, Hyperbolic's "Proof of Sampling" paper).
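The economics of this sampling approach fit in a few lines: with audit probability p and slashable stake S, cheating is irrational whenever the expected penalty p * S exceeds the gain per job. All figures below are illustrative assumptions:

```python
def cheating_profitable(gain_per_job: float, audit_rate: float,
                        slash_amount: float) -> bool:
    # Rational to cheat only if expected gain exceeds expected penalty.
    return gain_per_job > audit_rate * slash_amount

gain = 0.50        # $ saved per job by faking inference (assumed)
audit_rate = 0.02  # fraction of results re-verified (assumed)
slash = 1_000.0    # stake slashed when caught (assumed)

print(cheating_profitable(gain, audit_rate, slash))  # False: $0.50 < $20
# Verification cost scales with the 2% audit rate, not with N replicas.
```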

Another promising idea is to use watermarking and fingerprinting solutions, such as the one proposed by Bagel Network, which is similar to the mechanism Amazon Alexa uses to ensure the quality of its AI models on millions of devices.

Real Benefit #7: Savings through a composable open source software stack (avoiding OpenAI's margins)

The next opportunity web3 brings to AI is democratized cost reduction. So far we have discussed saving on GPU costs through DePINs like io.net. But web3 also offers a chance to avoid the profit margins of centralized web2 AI services such as OpenAI, which has over $1 billion in annualized revenue as of this writing. These savings come from using open source software (OSS) models instead of proprietary ones, since OSS model creators are not trying to turn a profit.

Many open source models will always be completely free, which gives customers the best economics. But some OSS models may try the monetization methods below. Consider that only 4% of the models on Hugging Face are trained by companies with budgets to subsidize them; the remaining 96% are trained by the community. These community models face real costs (both compute and data), so they need to be monetized somehow.

There are many proposals for monetizing OSS models. One of the most interesting is the concept of an "Initial Model Offering" (IMO): the model itself is tokenized, a portion of the tokens is reserved for the team, and a share of the model's future revenue flows to token holders, though there are obviously legal and regulatory hurdles here.

Other OSS models will attempt usage-based monetization. If this plays out, OSS models may start to look more and more like their profit-generating web2 counterparts. Realistically, though, the market will bifurcate, and some models will remain completely free.

Once you've chosen an open source model, you can layer composable operations on top of it. For example, you can use Ritual.net for AI inference, with Theoriq.ai as an early leader in composable, autonomous on-chain AI agents (both backed by Hack VC).

Real Benefit #8: Decentralized Data Collection

One of the biggest challenges facing AI is getting the right data suitable for training models. We mentioned earlier that there are some challenges with decentralized AI training. But what about leveraging the decentralized web to get data that can then be used for training elsewhere, even on traditional web2 platforms?

This is exactly what startups like Grass (backed by Hack VC) are doing. Grass is a decentralized “data scraping” network where individuals contribute their machines’ idle processing power to fetch data for training AI models. In theory, at scale, this data collection could be superior to any one company’s in-house efforts because of the massive network of incentivized nodes with massive computing power. This involves not only fetching more data, but also fetching it more frequently so that it’s more relevant and up-to-date. Since these data scraping nodes are decentralized in nature and not owned by a single IP address, it’s nearly impossible to stop this decentralized army of data scrapers. In addition, they have a network of humans who clean and normalize the data to make it useful after it’s been scraped.
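A minimal sketch of what a single contributing node in such a network might do: fetch a public page with idle bandwidth, normalize it, and checksum it for deduplication. The record format is a hypothetical illustration, not the Grass protocol:

```python
import hashlib
import urllib.request

def fetch_and_normalize(url: str) -> dict:
    with urllib.request.urlopen(url, timeout=10) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    text = " ".join(html.split())        # collapse whitespace/noise
    return {
        "url": url,
        "content": text,
        "checksum": hashlib.sha256(text.encode()).hexdigest(),  # dedup key
    }

record = fetch_and_normalize("https://example.com")
print(record["checksum"][:16], len(record["content"]))
```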

Once you have the data, you also need somewhere on-chain to store it, along with the LLMs (large language models) produced from that data. 0g.AI is an early leader here: a high-performance web3 storage solution optimized for AI that is far cheaper than AWS (another economic win for Web3 AI), while also serving as data availability infrastructure for layer 2s, AI, and more.

It is worth noting that the role of data in web3 AI may change in the future. Today, the status quo for LLMs is to pre-train a model on data and refine it over time with more data. But because data on the internet changes in real time, these models are always slightly out of date, so LLM inference responses are slightly inaccurate.

A new paradigm that may develop is "live" data: when an LLM is asked to run inference, it is injected with data collected from the internet in real time, so the model uses the freshest data available. Grass is working on this as well.
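A minimal sketch of the live-data pattern: fetch fresh context at inference time and inject it into the prompt. Both `fetch_live_snippets` and `call_llm` are hypothetical placeholders, not any particular API:

```python
def fetch_live_snippets(query: str) -> list[str]:
    # In practice: query a real-time scraping network for recent pages.
    return ["(freshly scraped snippet 1)", "(freshly scraped snippet 2)"]

def call_llm(prompt: str) -> str:
    # In practice: an inference call to any LLM endpoint.
    return f"<answer grounded in: {prompt[:50]}...>"

def live_inference(question: str) -> str:
    context = "\n".join(fetch_live_snippets(question))
    prompt = (f"Answer using only the context below, collected moments "
              f"ago:\n{context}\n\nQuestion: {question}")
    return call_llm(prompt)

print(live_inference("What happened in the markets today?"))
```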

Conclusion

We hope this analysis has been helpful as you think about the promise and reality of web3 AI. This is just a starting point for discussion, and the field is changing rapidly, so feel free to join in and express your views, as we continue to learn and build together.
