CFG_Labs - e/acc

839 Twitter followers

We build the deep thinking community for Web3 and AGI. Contact: http://mirror.xyz/infinet.eth http://discord.gg/RDRYgnmSTd

Posts

CFG_Labs - e/acc

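China has no company valued above 1 trillion USD, because it has competition. China has no Apple, but it has OPPO, Xiaomi, and Huawei. China has no Tesla, but it has BYD, NIO, XPeng, and Li Auto. China has no Amazon, but it has Taobao, Douyin, and JD.com. America's trillion-dollar companies are protected monopolies. Shareholders may be happy, but consumers have to put up with poor Pixel phones, a bug-ridden Windows, overpriced mid-range Teslas, and bot-generated spam on Facebook. Chinese consumers get better products at lower prices.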

“China doesn’t even have 1 company valued > 1T USD.” Now, isn’t this absolutely wonderful?! The government ensures that wealth generated accrues to the workers, the country and her people and some to the shareholders. When fairly distributed, some shareholders may become x.com/_mm85/status/2…

CFG_Labs - e/acc

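DeepSeek is back! "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models." They introduce Engram, a module that adds O(1) lookup-style memory built on modernized hashed N-gram embeddings. Mechanistic analysis suggests that Engram relieves the early layers of reconstructing static patterns, which leaves the model more effective depth for the part that matters (reasoning). Paper: github.com/deepseek-ai/Engram/...…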
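To make "hashed N-gram embeddings with O(1) lookup" concrete, here is a minimal sketch of the general idea only, not the Engram implementation. The class name `HashedNgramMemory`, the rolling hash, the table size, and the random initialization are all assumptions for illustration; the post does not specify any of them.

```python
import random


class HashedNgramMemory:
    """Toy hashed N-gram memory (hypothetical; not DeepSeek's Engram code)."""

    def __init__(self, num_slots, dim, n=2, seed=0):
        rng = random.Random(seed)
        self.n = n
        self.num_slots = num_slots
        # Fixed-size table: memory cost does not grow with the number of
        # distinct N-grams, at the price of hash collisions.
        self.table = [
            [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for _ in range(num_slots)
        ]

    def slot(self, ngram):
        """Map a tuple of token ids to a table index with a rolling hash."""
        h = 0
        for tok in ngram:
            h = (h * 1000003 + tok + 1) % (2**61 - 1)
        return h % self.num_slots

    def lookup(self, token_ids):
        """Return one memory vector per position, keyed by its trailing N-gram.

        Each position costs one hash and one table read, independent of the
        table size, which is the sense in which the lookup is O(1).
        """
        out = []
        for i in range(len(token_ids)):
            start = max(0, i - self.n + 1)
            ngram = tuple(token_ids[start : i + 1])
            out.append(self.table[self.slot(ngram)])
        return out


memory = HashedNgramMemory(num_slots=1024, dim=4, n=2)
vectors = memory.lookup([5, 7, 9])
```

In this sketch, two sequences that end in the same bigram retrieve the same vector at that position no matter what came earlier, so a static local pattern is fetched directly instead of being rebuilt by the network.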

CFG_Labs - e/acc

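One of my favorite findings: positional embeddings are like training wheels. They help the model converge, but they hurt long-context generalization. We found that if you simply remove the positional embeddings after pretraining and recalibrate the model with less than 1% of the original budget, you unlock massive context windows.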

Introducing DroPE: Extending the Context of Pretrained LLMs by Dropping Their Positional Embeddings https://pub.sakana.ai/DroPE/ We are releasing a new method called DroPE to extend the context length of pretrained LLMs without the massive compute costs usually associated with
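As a rough illustration of what "dropping" positional embeddings means at the attention level, the sketch below computes scaled dot-product scores with and without a rotary-style position rotation. It assumes rotary embeddings purely for illustration (the post does not say which kind the models use), the function names are hypothetical, and the recalibration step the post mentions is not shown.

```python
import math


def rotate(vec, pos, base=10000.0):
    """Apply a rotary-style position rotation to an even-length vector."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out


def attention_scores(queries, keys, use_positions=True):
    """Scaled dot-product scores; use_positions=False skips the rotation."""
    d = len(queries[0])
    if use_positions:
        queries = [rotate(q, i) for i, q in enumerate(queries)]
        keys = [rotate(k, j) for j, k in enumerate(keys)]
    return [
        [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        for q in queries
    ]


# Four identical tokens: any difference between scores comes from position.
tokens = [[1.0, 0.0, 1.0, 0.0]] * 4
with_positions = attention_scores(tokens, tokens, use_positions=True)
without_positions = attention_scores(tokens, tokens, use_positions=False)
```

With the rotation, the score between identical tokens changes with their distance; without it, every pair scores the same, so nothing in the score is tied to the sequence lengths seen during pretraining.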