NVIDIA predicts memory will be the AI server bottleneck, with demand for DRAM and NAND expected to surge.

With NVIDIA identifying memory capacity and bandwidth as a key bottleneck in AI server architectures, demand for conventional memory semiconductors such as standard DRAM and NAND flash is likely to rise as well. Industry analysts say the explosive computational demands of AI inference are driving a shift toward architectures that operate a variety of memory resources in concert, beyond high-bandwidth memory (HBM) alone.

This assessment is based on comments made by NVIDIA CEO Jensen Huang during his keynote speech at CES 2026 in Las Vegas. Huang predicted that the AI industry would continue its strong growth through 2026, citing memory capacity and bandwidth shortages as technological bottlenecks. This suggests that server architectures, once centered on GPUs (graphics processing units), are expanding to encompass diverse memory layers, including DRAM and NAND.

Samsung Securities pointed to "Agent AI" as the heart of this change. Agent AI refers to AI that makes independent judgments and responds to situations across diverse environments, which requires processing far more data, far faster, than existing models. As a result, the number of computational calls and the sequence lengths handled in the inference stage of AI models have risen sharply, making sufficient memory bandwidth and capacity essential to meet this demand.
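To see why longer sequences translate directly into memory pressure, consider the KV cache that transformer-based models maintain during inference: its size grows linearly with both context length and batch size. The back-of-envelope calculation below uses an illustrative 70B-class model configuration (grouped-query attention with 8 KV heads, FP16 cache entries); all figures are assumptions for the example, not numbers from the article.

```python
# Back-of-envelope sizing of a transformer KV cache. All model figures
# below are illustrative assumptions (a 70B-class model with grouped-query
# attention), not numbers from the article.

def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, batch: int, bytes_per_elem: int = 2) -> int:
    """Bytes needed to hold the keys and values for every cached token.

    The leading 2 accounts for storing both a key and a value tensor.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_elem

# Hypothetical serving scenario: 80 layers, 8 KV heads of dimension 128,
# FP16 entries, 32 concurrent requests at a 32k-token context each.
size = kv_cache_bytes(n_layers=80, n_kv_heads=8, head_dim=128,
                      seq_len=32_768, batch=32)
print(f"KV cache: {size / 2**30:.0f} GiB")  # -> 320 GiB, vs. ~80 GB of HBM per GPU
```

At roughly 320 GiB for this single workload, the cache alone dwarfs the HBM on any one GPU, which is why spilling it into DRAM and NAND tiers becomes attractive.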

Reflecting this trend, NVIDIA has presented a new storage-memory architecture built around the BlueField-4 DPU (data processing unit) within its AI server systems. The architecture uses not only high-bandwidth memory (HBM) but also low-power DRAM (LPDDR) and NVMe-based high-capacity NAND as hierarchical caches. In particular, it aims to raise overall system efficiency sharply by keeping the KV cache (key-value cache), the store of intermediate attention data generated during inference, directly connected to the GPU.
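The article does not detail how such a hierarchy is managed, but the general pattern resembles a multi-level cache: hot KV entries stay in HBM, warm ones spill to DRAM, and cold ones to NVMe NAND. The sketch below is a minimal illustration of that tiering idea with LRU demotion; the class names, capacities, and policy are invented for the example and do not describe NVIDIA's BlueField-4 software.

```python
# Minimal sketch of a tiered KV-cache store (HBM -> DRAM -> NVMe NAND).
# All names, capacities, and the LRU policy are invented for illustration;
# this is not NVIDIA's implementation.
from collections import OrderedDict

class CacheTier:
    def __init__(self, name: str, capacity: int):
        self.name = name
        self.capacity = capacity        # max cached sequences in this tier
        self.entries = OrderedDict()    # sequence_id -> KV tensor blob

class TieredKVCache:
    """Promotes hot sequences toward HBM, demotes cold ones toward NAND."""

    def __init__(self):
        self.tiers = [CacheTier("HBM", 4),
                      CacheTier("DRAM", 16),
                      CacheTier("NVMe", 256)]

    def get(self, seq_id):
        for tier in self.tiers:
            if seq_id in tier.entries:
                blob = tier.entries.pop(seq_id)
                self._put(0, seq_id, blob)   # promote hit to the fastest tier
                return blob
        return None                          # miss: caller must recompute KVs

    def put(self, seq_id, blob):
        self._put(0, seq_id, blob)

    def _put(self, level, seq_id, blob):
        if level >= len(self.tiers):
            return                           # fell off the slowest tier: evicted
        tier = self.tiers[level]
        tier.entries[seq_id] = blob
        tier.entries.move_to_end(seq_id)     # mark as most recently used
        if len(tier.entries) > tier.capacity:
            old_id, old_blob = tier.entries.popitem(last=False)  # LRU victim
            self._put(level + 1, old_id, old_blob)               # demote it

cache = TieredKVCache()
for i in range(8):                  # more sequences than HBM can hold
    cache.put(f"seq-{i}", f"kv-tensors-{i}")
cache.get("seq-0")                  # cold entry promoted back from DRAM
for tier in cache.tiers:
    print(tier.name, list(tier.entries))
```

In a real serving stack, the "blobs" would be GPU tensors and demotion would involve DMA transfers orchestrated by the DPU; plain Python objects stand in for them here.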

The amount of memory and storage installed in a single AI server is expected to keep growing. In other words, demand for memory capacity may rise faster than GPU compute performance. NVIDIA's message, ultimately, is that the core of AI computing architecture lies not in the GPU alone but in the tight integration of memory and storage resources.

These changes could reshape the semiconductor industry's demand structure over the medium to long term. Beyond today's demand, which centers on high-performance memory, a broader range of products, including general-purpose DRAM and high-capacity NAND flash, could grow simultaneously. That would be a positive signal for South Korea's memory semiconductor industry.
