DeepSeek's computing power is constrained: is AI research at universities facing a bottleneck?

Huawei partners with 15 universities to provide the strongest solution.

Source: New Intelligence

Image source: Generated by Boundless AI

Even the top five machine learning PhD programs in the US don't have a single GPU capable of delivering serious computing power?

In mid-2024, a post on Reddit by one such user immediately sparked heated discussion in the community.

By the end of the year, a report in Nature further exposed the severe challenges academia faces in acquiring GPUs: researchers literally have to queue up to apply for time on their university's GPU cluster.

A severe shortage of GPUs is just as common in the laboratories of Chinese universities. There have even been reports of schools asking students to bring their own computing power to class, which borders on the absurd.

Clearly, the computing-power bottleneck has turned AI itself into a course with an extremely high barrier to entry.

AI talent shortages and insufficient computing power

At the same time, the rapid development of large models and embodied intelligence is triggering a global talent shortage.

According to calculations by a professor at the University of Oxford, the share of job postings in the US requiring AI skills has increased fivefold.

Globally, the number of Tech-AI postings has grown 9-fold, and the number of Broad-AI postings has grown 11.3-fold.

During this period, the growth in Asia has been particularly remarkable.

Although universities around the world are trying to help students master key AI capabilities, as mentioned earlier, computing power has now become a "luxury".

To bridge this gap, collaboration between enterprises and universities has become an important lever.

Kunpeng Ascend Science and Education Innovation Incubation Centers: laying out a new landscape for university research

Fortunately, Huawei has already begun laying out just such an innovation system in Chinese universities.

Huawei has now signed agreements to establish "Kunpeng Ascend Science and Education Innovation Excellence Centers" with five top universities: Peking University, Tsinghua University, Shanghai Jiao Tong University, Zhejiang University, and the University of Science and Technology of China.

In addition, Huawei is simultaneously rolling out "Kunpeng Ascend Science and Education Innovation Incubation Centers" with 10 other universities: Fudan University, Harbin Institute of Technology, Huazhong University of Science and Technology, Xi'an Jiaotong University, Nanjing University, Beihang University, Beijing Institute of Technology, University of Electronic Science and Technology of China, Southeast University, and Beijing University of Posts and Telecommunications.

The establishment of the Excellence Centers and Incubation Centers is a model of industry-education integration:

  • Introducing the Ascend ecosystem makes up for the shortage of computing power in universities and unlocks more research output;

  • Reforming the curriculum around research, industry, and competition projects cultivates top talent for the computing industry;

  • Tackling system architecture, compute acceleration, algorithms, and systems capabilities aims to nurture world-class innovations;

  • Building out "AI+X" interdisciplinary fields leads the development of an intelligent-computing ecosystem.

Building fully self-reliant domestic computing power for AI research

Now, the significance of AI for Science is self-evident.

According to the latest survey by Google DeepMind, one out of every three postdoctoral researchers uses large language models to assist in literature reviews, programming, and article writing.

This year's Nobel Prizes in Physics and Chemistry were also awarded to researchers in the field of AI.

It is clear why: in empowering scientific research with AI, GPUs, with their outstanding performance in these high-performance-computing fields and their strength in LLM training and inference, have become as precious as gold, fought over by companies like Microsoft, xAI, and OpenAI.

However, the US blockade on GPUs has made China's progress in AI and scientific research extremely difficult.

To cross this divide, we must build and develop a self-reliant and complete ecosystem.

At the computing power level, Huawei's Ascend series AI processors have taken on the task of reshaping China's competitiveness.

On top of the raw computing power, a self-developed computing architecture is also needed to fully unleash the advantages of these NPUs/AI processors.

As is well known, the CUDA architecture designed specifically for NVIDIA GPUs is commonly used in the field of AI and data science.

The only real competitor and replacement in China is CANN.

As Huawei's heterogeneous computing architecture for AI scenarios, CANN supports mainstream AI frameworks such as PyTorch, TensorFlow, and Huawei's own MindSpore, and serves as the key platform for unlocking the computing efficiency of Ascend AI processors.

CANN accordingly has a number of technical advantages, the most critical being deeper software-hardware co-optimization for AI computing and a more open software stack:

  • First, it can support multiple AI frameworks, including Huawei's own MindSpore, as well as third-party PyTorch and TensorFlow;

  • Second, it provides multi-level programming interfaces for diverse application scenarios, allowing users to quickly build AI applications and businesses based on the Ascend platform;

  • Moreover, it also provides model migration tools to help developers quickly migrate projects to the Ascend platform.

Currently, CANN has built an initial ecosystem of its own. On the technical side, it encompasses a large number of applications, tools, and libraries, forming a complete technical ecosystem that gives users a one-stop development experience. At the same time, the developer community built around Ascend technology is steadily growing, laying fertile ground for future applications and innovation.

On top of the heterogeneous computing architecture CANN, we also need a deep learning framework for AI model building.

Almost every AI developer relies on a deep learning framework, and almost every DL algorithm and application is implemented on top of one.

Nowadays, there are well-known frameworks such as Google's TensorFlow and Meta's PyTorch, which have formed a huge ecosystem.

In the era of large-model training, deep learning frameworks must also train effectively at the scale of thousands of machines.

Huawei's MindSpore, an open-source full-scenario deep learning framework officially launched in March 2020, has filled the gap in this field in China, achieving true self-reliance and control.

MindSpore's key features include full-scenario cloud-edge-device deployment, native support for large-model training, and support for AI plus scientific computing. It provides a native development environment with full-scenario collaboration and end-to-end simplicity, accelerating domestic research innovation and industrial applications.

What makes it special is that, as the "best partner" of Ascend AI processors, MindSpore covers device, edge, and cloud scenarios with a unified architecture: train once, deploy anywhere.
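To make that "train once, deploy anywhere" idea concrete, here is a minimal sketch based on MindSpore's publicly documented API (not code from the article); the model and shapes are hypothetical, and the backend is chosen by a single device_target setting:

```python
# Minimal MindSpore sketch (assumed public API, not from the article):
# the same network definition runs on CPU, GPU, or Ascend by changing device_target.
import numpy as np
import mindspore as ms
from mindspore import nn, Tensor

# One line selects the backend; "Ascend" targets the NPU, "CPU" runs anywhere.
ms.set_context(mode=ms.GRAPH_MODE, device_target="Ascend")

class TinyNet(nn.Cell):
    """Toy two-layer MLP, only to show the framework boilerplate."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Dense(16, 32)
        self.relu = nn.ReLU()
        self.fc2 = nn.Dense(32, 4)

    def construct(self, x):
        return self.fc2(self.relu(self.fc1(x)))

net = TinyNet()
x = Tensor(np.random.randn(8, 16).astype(np.float32))
print(net(x).shape)  # expected: (8, 4)
```

The same script, with device_target switched to "CPU" or "GPU", is the kind of portability the "multi-point deployment" claim refers to.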

From large-scale Earth-system simulation and autonomous driving down to protein structure prediction at the molecular scale, all of it can be built on Huawei's MindSpore.

Open-source deep learning frameworks can only unleash greater value with a wide developer ecosystem.

The 2023 "China AI Framework Market Research" report released by the analyst firm Omdia shows that MindSpore has entered the top tier of AI frameworks by usage, second only to TensorFlow.

In addition, inference applications across thousands of industries are the key to unleashing the value of AI. As GenAI development accelerates, both universities and enterprises urgently need faster inference.

For example, the high-performance inference optimizer and compiler TensorRT is a powerful tool for boosting large-model inference performance: with quantization and sparsity it reduces model complexity and efficiently speeds up deep learning inference. The problem is that it only supports NVIDIA GPUs.
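To make the quantization idea concrete, here is a small, generic PyTorch sketch of post-training dynamic quantization (my own illustration of the technique, not TensorRT or MindIE code); it converts the weights of linear layers to int8, which typically shrinks the model and speeds up inference:

```python
# Generic illustration of post-training dynamic quantization in PyTorch.
# This is not TensorRT- or MindIE-specific code; it only demonstrates the idea
# that lower-precision weights reduce model size and inference cost.
import torch
import torch.nn as nn

# A toy model standing in for a much larger network.
model = nn.Sequential(
    nn.Linear(512, 1024),
    nn.ReLU(),
    nn.Linear(1024, 256),
)
model.eval()

# Quantize Linear-layer weights to int8; activations are quantized dynamically at runtime.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 512)
with torch.no_grad():
    print(quantized(x).shape)  # same output shape from a smaller, usually faster model
```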

Just as there are domestic counterparts for the computing architecture and the deep learning framework, there is also a corresponding inference engine: Huawei's Ascend MindIE.

MindIE is a full-scenario AI inference acceleration engine that integrates the industry's most advanced inference acceleration techniques and inherits features from open-source PyTorch.

Its design balances flexibility and practicality: it integrates seamlessly with mainstream AI frameworks, supports different types of Ascend AI processors, and offers users multi-level programming interfaces.

Through full-stack joint optimization and layered, open AI capabilities, MindIE can extract the full computing power of Ascend hardware and provide users with efficient, fast deep learning inference. It addresses the high technical difficulty and many development steps of model inference and application development, improves model throughput, shortens time to launch, and enables hundreds of models across thousands of scenarios to meet diverse AI business needs.

Home-grown technologies such as CANN, MindSpore, and MindIE thus not only fill the gaps in domestic computing power, but also deliver leapfrog advances in model training, framework usability, and inference performance, benchmarking directly against advanced technology stacks abroad.

Building a World-Class Incubation Center

Beyond the technical advantages, it is fair to say that over the coming decades, the use of Ascend computing power will better align with the country's needs.

Only domestic self-developed computing power can avoid the impact of the changing external environment and ensure the stability of the scientific research foundation.

Now that the platform has been built, how can we teach teachers and students in universities to use it?

Since September 6th last year, Huawei has held its first Ascend AI special training camps at four major universities: Peking University, Shanghai Jiao Tong University, Zhejiang University, and the University of Science and Technology of China. Of the hundreds of students who registered and attended, 90% were master's and doctoral students, and the courses covered the Ascend field end to end, including CANN, MindSpore, MindIE, MindSpeed, HPC, and Kunpeng development tools.

At the camps, students not only learn the core technologies in detail but also get hands-on practice, an arrangement well matched to how students absorb new knowledge: from shallow to deep, step by step.

For example, at the Shanghai Jiao Tong University session, the first day focuses on migration: the basic Ascend AI software and hardware solution, hands-on cases of native PyTorch model development on Ascend, and the features and migration cases of the MindIE inference solution.

The second day focuses on optimization, covering the Ascend heterogeneous computing architecture CANN, Ascend C operator development, and hands-on optimization of long-sequence inference for large models.

Structuring the courses around migration and optimization is a far-sighted choice.

Bear in mind that many universities' practical courses are still built around CUDA/x86 setups, and under the impact of sanctions the shortage of computing power is becoming ever more acute. Mastering the migration workflow means a project can be moved to the Ascend platform so that academic research can continue, as sketched below.
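As a rough sketch of what such a migration can look like (assumed usage of the torch_npu adapter based on its public documentation, not material from the training camp), a CUDA-based PyTorch training step often only needs its device strings changed once the Ascend adapter is installed:

```python
# Hypothetical migration sketch: moving a PyTorch training step from CUDA to an Ascend NPU.
# Assumes the Ascend PyTorch adapter (torch_npu) is installed; exact setup varies by environment.
import torch
import torch_npu  # registers the "npu" device type with PyTorch
import torch.nn as nn

# Before migration this line would have been: device = torch.device("cuda:0")
device = torch.device("npu:0")

model = nn.Linear(128, 10).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# One ordinary training step; nothing else in the loop has to change.
x = torch.randn(32, 128).to(device)
y = torch.randint(0, 10, (32,)).to(device)

optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
print(loss.item())
```

Real projects also need their data pipelines, custom CUDA kernels, and third-party dependencies checked, which is exactly what the migration tools and courses above are for.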

After mastering the basics, students move on to hands-on case studies. Huawei's experts guide them step by step through the Ascend technology stack and the full workflow of large-model quantization, inference, and Codelabs code implementation.

After the hands-on practice, the students will have a deeper understanding of the Ascend ecosystem through personal experience, laying a solid foundation for their future work in the technical field.

Students doing hands-on work at the first special training camp at Shanghai Jiao Tong University.

Beyond the courses, Huawei also holds an operator challenge for university developers, aimed at unearthing operator-development talent.

The competition encourages developers to carry out in-depth innovation and practice based on Ascend computing power resources and the basic capabilities of CANN, accelerating the integration of AI and industry, and promoting the improvement of developers' abilities.

In addition, the incubation center also attaches great importance to academic achievements.

Students who conduct academic research based on key Kunpeng or Ascend computing technologies and tools can apply for postgraduate scholarships; if their papers are accepted at top international conferences or leading domestic journals during this period, they receive additional rewards.

At the same time, Huawei has also joined with Kunpeng & Ascend ecosystem partners to launch the Excellent Talent Program.

This program allows on-campus students to move from theory to practice, entering the real work scenarios of enterprises, while helping outstanding students to connect with enterprises in advance.

The Excellent Talent Program has now partnered with more than 200 companies across 15 cities, offering more than 2,000 technical positions and helping more than 10,000 university students find employment.

Overall, these teaching practices and incentive programs greatly raise students' enthusiasm. They not only gain academic experience and research results; their profiles also become more competitive, making it easier to win offers from top companies at home and abroad.

So, once students have mastered the latest technologies and their applications, how can truly groundbreaking research emerge in today's fast-changing AI field?

Since Sora set off the text-to-video craze in 2024, text-to-video large models have been emerging one after another. Open-Sora Plan, the open-source text-to-video project from Peking University and Rabbitpre, caused a sensation in the industry.

In fact, even before Sora was released, the team had been preparing an open-source counterpart, but the computing power and data requirements could not be met and the project was temporarily shelved. Fortunately, the Kunpeng Ascend Science and Education Innovation Excellence Center jointly established by Peking University and Huawei quickly provided the team with computing power.

The team originally used NVIDIA A100, and after migrating to the Ascend ecosystem, they made a series of surprising discoveries:

CANN enables efficient parallel computing and significantly speeds up the processing of large-scale datasets; the Ascend C interface library simplifies AI application development; and the operator acceleration library further optimizes algorithm performance.

More importantly, the open Ascend ecosystem can quickly adapt large models and applications.

So although the team members started from scratch on the Ascend ecosystem, they got up to speed quickly.

During subsequent training, the team kept finding pleasant surprises: for example, with torch_npu, the entire codebase could be trained and run for inference on the Ascend NPU seamlessly.

When model partitioning is required, the Ascend MindSpeed distributed acceleration suite provides a rich set of large model distributed algorithms and parallel strategies.

Moreover, in large-scale training, MindSpeed and Ascend hardware proved far more stable than other computing platforms, running continuously for a week without interruption.
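MindSpeed's own APIs are not shown here, but as a hedged sketch of what multi-device training on Ascend NPUs can look like with stock PyTorch facilities (assuming, as torch_npu's public materials describe, an "hccl" collective backend analogous to NCCL), a data-parallel setup might be:

```python
# Hedged sketch of data-parallel training on Ascend NPUs with plain PyTorch + torch_npu.
# Assumes an "hccl" process-group backend (Huawei Collective Communication Library),
# analogous to NCCL on GPUs; this is not MindSpeed's own API.
import os
import torch
import torch_npu  # registers the "npu" device type
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # Launched with torchrun, which sets RANK / LOCAL_RANK / WORLD_SIZE.
    local_rank = int(os.environ["LOCAL_RANK"])
    dist.init_process_group(backend="hccl")  # collective backend for Ascend
    torch.npu.set_device(local_rank)
    device = f"npu:{local_rank}"

    model = DDP(nn.Linear(256, 10).to(device), device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

    x = torch.randn(64, 256).to(device)
    y = torch.randint(0, 10, (64,)).to(device)

    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Suites like MindSpeed layer richer parallel strategies (tensor, pipeline, sequence parallelism) on top of this kind of foundation; the strategies the article refers to come from MindSpeed itself.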

Therefore, in just one month, the Open-Sora Plan was officially launched and received great recognition in the industry.

The scene of "Black Myth: Wukong" generated by the Open-Sora Plan is comparable to a blockbuster movie, stunning countless netizens.

In addition, Southeast University has developed a multi-modal transportation large model MT-GPT based on Ascend computing power.

In the past, the deployment of transportation large models was extremely difficult, due to problems such as data silos caused by data collection by different government departments, inconsistent data formats and standards, and the heterogeneity and multi-source nature of transportation data.

To solve these problems, the team designed a conceptual framework for a multimodal transportation large model, MT-GPT (Multimodal Transportation Generative Pre-trained Transformer), which provides data-driven solutions to multi-dimensional, multi-granularity decision-making problems across multimodal transportation tasks.

However, the development and training of large models undoubtedly requires an extremely high computing power foundation.

For this reason, the team chose to leverage the capabilities of Ascend AI to accelerate the development, training, optimization, and deployment of the transportation large model.

In the development stage, the Transformer large-model development suite combined a multi-source heterogeneous knowledge corpus with multimodal feature encoding to improve the model's accuracy in understanding multimodal generative problems.

In the training stage, the Ascend MindSpeed distributed training acceleration suite provides multi-dimensional, multi-mode, and multimodal acceleration algorithms for the transportation large model.

In the optimization stage, the Ascend MindStudio full-process toolchain is combined with domain-specific transportation knowledge to fine-tune the model.

In the deployment stage, the Ascend MindIE inference engine supports one-stop inference for the transportation large model, along with cross-city migration analysis, development, debugging, and optimization.

In summary, Peking University's Open-Sora Plan is a migration project that reproduces Sora, and as an open-source project it also empowers developers worldwide to apply it in more scenarios.

Southeast University's multimodal transportation large model MT-GPT, in turn, shows what Ascend computing power can do for turning research into real-world results, directly empowering urban transportation.

Thus, a closed loop of industry-academia-research has been fully formed.

These fruitful achievements also further prove that the Excellence Center/Incubation Center can not only provide a fertile ground for academic research and scientific innovation for universities, but also cultivate a large number of top AI talents, and then incubate scientific research results that lead the world.

For example, while the Peking University team was developing the Open-Sora Plan, Professor Yuan Li organized daily brainstorming sessions on code and algorithm development between the students and the Huawei Ascend team.

In this process of crossing the river by feeling for the stones, many of the Peking University students took part first-hand in high-quality research practice and showed remarkable research creativity.

This team with an average age of 23 has also become a mainstay force in promoting domestic AI video applications.

Along the way, a young cohort that has mastered the Kunpeng and Ascend ecosystems keeps growing.

What kind of innovation system should our country build?

Clearly, with this new paradigm of university-enterprise cooperation, Huawei has officially set sail.

After establishing its Computing Product Line in 2019, Huawei quickly signed an intelligent-base cooperation project with the Ministry of Education in 2020 and carried out teaching cooperation with 72 top universities across the country.

At that time, some technical knowledge of Kunpeng/Ascend had already been integrated into some compulsory courses in university undergraduate programs.

However, the investment in universities is a medium and long-term cultivation process. Only by allowing students and teachers to understand the relevant technologies first can they play a greater role in the future.

Huawei therefore plans to invest 10 billion yuan per year in developing the native Kunpeng and Ascend ecosystem and its talent pool, giving university talent and developers richer resources and broader room to grow. It has also launched a plan to donate 100,000 Kunpeng development boards and Ascend inference development boards, encouraging teachers and students to actively explore and apply Kunpeng and Ascend technologies in teaching experiments, competitions, and technological innovation.

Under this plan, teachers and students get hands-on access to the development boards. Whether for teaching or for research experiments, they can spark new ideas on the boards and pursue the innovations they want to make.

The OrangePi AIpro development board, jointly launched by Orange Pi and Huawei Ascend, meets the needs of most AI algorithm prototyping and inference application development, and can be widely used in AI edge computing, deep learning for vision, drones, cloud computing, and other fields, showing strong capability and broad applicability.

On the other hand, China's particular situation, a technological blockade from outside, also means the time left to us is limited. We must have an independent, controllable technology stack.

Native development has become a necessity for the future; only technology made in China fits the country's trajectory as a major power.

As localization becomes a trend, domestic technology stacks such as Kunpeng/Ascend will also be ubiquitous in various IT infrastructures.

The launch of the Excellence Center and Incubation Center has also made the industry more confident.

It is foreseeable that, after a few years of incubation, researchers who have mastered the domestic technology stack will keep advancing the Kunpeng/Ascend technology route and incubate world-leading research results.
