Lao Gao on DeepSeek: It Definitely Did Not Copy ChatGPT, and It Bypassed NVIDIA's CUDA Platform with Low-Level Technology


Chinese AI startup DeepSeek recently released two large models, "DeepSeek-V3" and "DeepSeek-R1". Their low cost and performance comparable to OpenAI's models have caused a stir in Silicon Valley and could reshape the large-model landscape.

Lao Gao Discusses the Impact of DeepSeek

In response, YouTuber Lao Gao released a video titled "The Globally Disruptive DeepSeek Has Ignited a Smokeless War Between China and the US", in which he shares his views on DeepSeek.

Lao Gao pointed out that DeepSeek's success lies not only in model performance comparable to ChatGPT's, but also in its extremely low development cost. DeepSeek reportedly developed its top-tier model for only $5.6 million, about one-hundredth of OpenAI's cost, although the actual figure is hotly debated and may not be that low. This has significantly strengthened DeepSeek's competitiveness in AI and prompted the market to re-evaluate the cost and efficiency of AI development.

Lao Gao believes that DeepSeek's biggest breakthrough is its open-source strategy, which is different from OpenAI's closed-source model. DeepSeek has made its AI models publicly available, allowing anyone to download and run them locally, even for commercial use. This measure has not only significantly lowered the threshold for enterprises and individuals to use AI, but also posed a huge challenge to companies like OpenAI that rely on closed-source models for profitability.

Did DeepSeek Copy ChatGPT?

Lao Gao also stated that the accusation that DeepSeek copied ChatGPT is clearly unfounded: ChatGPT is closed-source, so outsiders cannot copy its internal workings. By contrast, he argued, DeepSeek is an open-source project whose code and data are all public, so if anything had been copied, OpenAI would have detected it long ago. He compared this to how some well-known operating systems are hard to imitate because they are closed-source.

Lao Gao suggested that any similarities between DeepSeek's and ChatGPT's results reflect shared inspiration rather than direct copying, since neither side can see the other's "secret recipe".

Lao Gao also claimed that DeepSeek's success lies in its ability to bypass NVIDIA's CUDA computing platform. He likened CUDA to a nuclear power plant that turns the raw computing power of chips into stable, efficient computing resources, and said it has long formed a seemingly insurmountable technical barrier.

CUDA is NVIDIA's parallel computing platform and programming model that allows developers to leverage NVIDIA GPUs for high-performance computing.

According to Lao Gao, DeepSeek can get around hardware limits on training speed and no longer relies on CUDA. In his view, this means that chips from any company, as long as they can work with this technology, could reach maximum computing performance without CUDA, further undermining NVIDIA's monopoly position in the computing market.

Did DeepSeek Really Bypass the NVIDIA CUDA Framework?

However, experts who spoke to Blocktime said the arguments in Lao Gao's video are mistaken. DeepSeek is definitely still using NVIDIA GPUs for computation, and it is also using the CUDA platform, contrary to Lao Gao's claim that it is "bypassing CUDA".

The experts pointed out that Lao Gao may have misunderstood recent news. DeepSeek has reportedly been preparing for potential future bans: if it can no longer use NVIDIA GPUs, it plans to use domestic Chinese GPUs as its computing source (there are also reports that China is still obtaining large numbers of NVIDIA chips through gray-market channels). This does not mean, however, that the company's current models "bypass CUDA" when they run.

Previously, Tom's Hardware reported that when DeepSeek trained on NVIDIA H800 chips, some functions were written in PTX (Parallel Thread Execution), NVIDIA's low-level instruction language, rather than in high-level CUDA code. Huang Lei, an associate professor at Beijing University of Aeronautics and Astronautics, explained that "bypassing CUDA" in this sense means DeepSeek can develop new things directly on top of the GPU's driver functions and thereby achieve finer-grained operations.
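
To make the distinction between PTX and high-level CUDA code concrete, here is a minimal sketch. It is not DeepSeek's code, and the kernel names add_cuda and add_ptx are made up for illustration. The same addition is written twice: once in ordinary CUDA C++, where the compiler picks the instructions, and once with the add instruction written by hand as inline PTX. Both kernels are compiled with NVIDIA's nvcc and launched through the CUDA runtime on an NVIDIA GPU, which fits the experts' point that writing PTX does not by itself mean leaving the CUDA platform.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Ordinary CUDA C++: the compiler decides which PTX instructions to emit.
__global__ void add_cuda(const int* a, const int* b, int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = a[i] + b[i];
}

// Same result, but the addition itself is hand-written as inline PTX.
// (Illustrative only; real hand-tuned PTX targets far more specialized
// instructions than a plain integer add.)
__global__ void add_ptx(const int* a, const int* b, int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        int r;
        asm volatile("add.s32 %0, %1, %2;" : "=r"(r) : "r"(a[i]), "r"(b[i]));
        out[i] = r;
    }
}

int main() {
    const int n = 4;
    int ha[n] = {1, 2, 3, 4};
    int hb[n] = {10, 20, 30, 40};
    int h_cuda[n], h_ptx[n];

    // Allocate device buffers and copy the inputs to the GPU.
    int *da, *db, *dout;
    cudaMalloc(&da, n * sizeof(int));
    cudaMalloc(&db, n * sizeof(int));
    cudaMalloc(&dout, n * sizeof(int));
    cudaMemcpy(da, ha, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, n * sizeof(int), cudaMemcpyHostToDevice);

    // Run the high-level kernel and fetch its result.
    add_cuda<<<1, n>>>(da, db, dout, n);
    cudaMemcpy(h_cuda, dout, n * sizeof(int), cudaMemcpyDeviceToHost);

    // Run the inline-PTX kernel and fetch its result.
    add_ptx<<<1, n>>>(da, db, dout, n);
    cudaMemcpy(h_ptx, dout, n * sizeof(int), cudaMemcpyDeviceToHost);

    // Both versions should agree element by element.
    for (int i = 0; i < n; ++i)
        printf("cuda=%d ptx=%d\n", h_cuda[i], h_ptx[i]);

    cudaFree(da);
    cudaFree(db);
    cudaFree(dout);
    return 0;
}
```

Assuming a machine with the CUDA toolkit and an NVIDIA GPU, this can be built with something like `nvcc add.cu -o add`, where the file name is arbitrary.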
