链研社
73,125 Twitter followers
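Data-driven trader who has weathered bull and bear markets | Operations veteran | Web3 startup OG | Binance Square creator. Believes cycles outweigh everything; focuses on the information that actually affects investment decisions.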
Posts
链研社
China's "efficiency revolution" in computing power is more effective than expanding memory production lines.
A counterintuitive fact: Chinese AI companies are achieving similar results with less memory, and their research papers are openly published. Applied by the three major overseas AI providers (OpenAI, Anthropic, and Google's Gemini), these techniques could potentially cut inference costs by an order of magnitude, raising gross margins while reducing memory requirements by a similar factor.
Take DeepSeek's MLA (Multi-head Latent Attention) architecture, KV-cache optimization, and various model quantization techniques as examples: they directly and significantly reduce GPU memory usage and bandwidth requirements at inference time, which sharply lowers the cost per generated token. Zhipu's ultra-fast inference, and cache billing from Alibaba's Qianwen and from Xiaomi cut to one-tenth, point the same way. What do these moves have in common? They all compress cost through algorithmic efficiency and squeeze the most out of available compute.
The market, however, is using an old map to find new roads. US AI companies keep raising capital expenditure, locking in large amounts of production capacity and compute in advance. $700 billion in capital expenditure is enough to make the entire AI supply chain, upstream and downstream, celebrate. The logic is sound as far as it goes: demand for compute and memory is indeed very large and growing rapidly. The problem is that it overlooks another curve: China's room for efficiency gains from compute optimization is just as striking. Everyone is betting that the "water sellers" will keep making money, but no one has noticed that the gold miners have suddenly learned to recycle water.
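To make the memory argument concrete, here is a rough back-of-the-envelope sketch of how KV-cache size scales for one sequence. Every number in it (layer count, head count, head dimension, context length, latent width) is a hypothetical illustration rather than a figure from DeepSeek or any other vendor, and the latent-cache formula is a deliberate simplification that ignores any extra per-token terms a real design may cache.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_value):
    """Standard attention: one key and one value vector per head, per layer, per token."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


def latent_cache_bytes(layers, latent_dim, seq_len, bytes_per_value):
    """Simplified latent-style cache: one compressed vector per layer, per token."""
    return layers * latent_dim * seq_len * bytes_per_value


# Hypothetical model shape, chosen only to make the arithmetic easy to follow.
LAYERS, KV_HEADS, HEAD_DIM = 32, 32, 128
SEQ_LEN = 4096      # tokens held in the cache for one sequence
LATENT_DIM = 512    # assumed width of the compressed per-token vector

full_fp16 = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, SEQ_LEN, 2)
latent_fp16 = latent_cache_bytes(LAYERS, LATENT_DIM, SEQ_LEN, 2)
latent_int8 = latent_cache_bytes(LAYERS, LATENT_DIM, SEQ_LEN, 1)  # 8-bit quantized cache

MIB = 1024 ** 2
print(f"full KV cache, fp16: {full_fp16 / MIB:.0f} MiB")
print(f"latent cache, fp16:  {latent_fp16 / MIB:.0f} MiB ({full_fp16 // latent_fp16}x smaller)")
print(f"latent cache, int8:  {latent_int8 / MIB:.0f} MiB ({full_fp16 // latent_int8}x smaller)")
```

Under these assumed shapes, compressing the cache and then quantizing it compound multiplicatively, which is how order-of-magnitude savings in cache memory become plausible. Real savings depend on the actual architecture and on model weights, which this sketch does not count.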
If Chinese AI companies cut memory usage by a further 50%, will the narrative behind storage stocks, whose valuations are currently propped up by capital expenditure, still hold? Today's outsized profits in the AI hardware supply chain rest largely on near-total dependence on the highest-end high-bandwidth memory (HBM). If models' memory demand falls significantly, that could directly erode the monopoly premium of the leading manufacturers and loosen the underlying logic of storage and compute stocks.
No one in the market seems to be seriously calculating how much memory China's algorithm-level efficiency revolution can actually save.
Objectively, though, there is a counterargument: if inference costs and memory usage fall by 50%, AI agents could start making high-frequency API calls around the clock and AI applications could explode. Even if each call uses less, a tenfold increase in total call volume would still push absolute demand for memory and compute sharply higher.
So efficiency gains that outpace new production capacity, and that could break the leaders' monopoly premium, are a risk to monitor rather than a settled conclusion. It remains to be seen how far the compute-efficiency path can go and whether it can keep improving. What is uncertain is how long this "unpriced" window will last: perhaps three months, perhaps a year.
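The tug-of-war between per-call savings and call volume can be checked with simple arithmetic. The sketch below uses the post's own illustrative figures (a 50% reduction per call, a tenfold rise in calls); the function names are made up for this example, and the model assumes demand is simply per-call usage times call count.

```python
def total_demand_multiplier(per_call_reduction, call_growth):
    """Change in total memory/compute demand, assuming demand = usage per call * number of calls.

    per_call_reduction: fraction saved per call (0.5 means each call uses half as much).
    call_growth: multiplier on the number of calls (10 means ten times as many calls).
    """
    return (1 - per_call_reduction) * call_growth


def break_even_call_growth(per_call_reduction):
    """Call-volume multiplier at which total demand is unchanged."""
    return 1 / (1 - per_call_reduction)


# The post's scenario: half the usage per call, ten times the calls.
print(total_demand_multiplier(0.5, 10))   # total demand multiplier
print(break_even_call_growth(0.5))        # call growth needed just to stay flat
```

With these figures total demand rises fivefold, and any call growth above 2x is enough to offset a 50% per-call saving. Whether usage actually grows that much is the open question the post raises.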
链研社
I recommend plan-tree, an open-source project from a member of this group. Its approach is to organize everything you discuss with the AI (the solutions considered, the decisions made, current progress, and where to pick up next) into a Markdown planning tree. It doesn't make decisions for you; it makes your decisions clearer and more complete.
It solves two of the most important problems in AI programming. First, it cuts the cost of alignment: each time you start a new session, you feed in the planning tree directly instead of explaining the background again. Second, it forces you to actually think things through. When you write your decisions into the planning tree, you find that many things you "almost understand" are really things you haven't worked out. Turning vague understanding into structured text is itself what clarifies your thinking.
plan-tree uses a "large loop" workflow: first discuss thoroughly, clarify, and document the feasible solutions; then let the AI execute them in batches; after execution, write the progress, evidence, and remaining issues back into the planning tree. Plant the tree first, then let the machine chop the wood. I think this is the proper approach to AI programming.
Think about it: the scarcest resource in the AI era isn't execution ability, which AI provides, but a sense of direction, the kind where you know "what I'm doing, why I'm doing it, and where I'm going next." plan-tree is a tool that helps you maintain this clarity, and it can reduce how far the AI's implementation drifts from what you intended. If you run into similar issues, I recommend giving it a try.
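For readers who want a feel for what a Markdown planning tree might look like, here is a minimal sketch. It is not plan-tree's actual format or code, which this post does not show; the `PlanNode` fields, the checkbox layout, and the sample project are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PlanNode:
    """One node of a hypothetical planning tree (field names are illustrative)."""
    title: str
    decision: str = ""       # what was decided for this item, if anything
    done: bool = False       # whether the item has been executed
    next_step: str = ""      # where to pick up in the next session
    children: list[PlanNode] = field(default_factory=list)


def render_markdown(node: PlanNode, depth: int = 0) -> list[str]:
    """Render the tree as nested Markdown checklist lines."""
    indent = "  " * depth
    box = "x" if node.done else " "
    lines = [f"{indent}- [{box}] {node.title}"]
    if node.decision:
        lines.append(f"{indent}  - Decision: {node.decision}")
    if node.next_step:
        lines.append(f"{indent}  - Next: {node.next_step}")
    for child in node.children:
        lines.extend(render_markdown(child, depth + 1))
    return lines


# Made-up example project.
root = PlanNode(
    "Ship login flow",
    decision="Use email magic links",
    children=[
        PlanNode("Design API", done=True),
        PlanNode("Write tests", next_step="Cover expired-link case"),
    ],
)
plan_lines = render_markdown(root)
print("\n".join(plan_lines))
```

The printed text is the kind of document you could paste at the start of a new session and update after each batch of execution, which is the loop the post describes.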
huangserva
@servasyy_ai
🔥🔥 Strongly recommend this group member's work! It has helped me tremendously with the multi-agent product I've been building recently. It turns short-term plans into a long-lived, stable, structured tree. plan-tree is a skill for preserving project planning over the long term.