CFG_Labs - e/acc

839 Twitter followers

We build the deep thinking community for Web3 and AGI. Contact: http://mirror.xyz/infinet.eth http://discord.gg/RDRYgnmSTd

Feed

CFG_Labs - e/acc

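China has no company valued at more than 1 trillion USD because it has competition. China has no Apple, but it has OPPO, Xiaomi, and Huawei. China has no Tesla, but it has BYD, NIO, XPeng, and Li Auto. China has no Amazon, but it has Taobao, Douyin, and JD.com. America's trillion-dollar companies are protected monopolies. Shareholders may be happy, but consumers have to put up with terrible Pixel phones, bug-ridden Windows, overpriced mid-trim Teslas, and junk content generated by Facebook bots. Chinese consumers get better products at lower prices.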

“China doesn’t even have 1 company valued > 1T USD.” Now, isn’t this absolutely wonderful?! The government ensures that wealth generated accrues to the workers, the country and her people and some to the shareholders. When fairly distributed, some shareholders may become x.com/_mm85/status/2…

CFG_Labs - e/acc

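DeepSeek is back in force! "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models". They introduce the Engram module, which adds O(1) lookup-style memory built on modernized hashed N-gram embeddings. Mechanistic analysis suggests that Engram relieves the early layers of having to reconstruct static patterns, leaving the model more effective depth for the part that matters (reasoning). Paper: github.com/deepseek-ai/Engram/...…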
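To make the "hashed N-gram embeddings with O(1) lookup" idea concrete, here is a minimal sketch in plain Python. It is not DeepSeek's Engram implementation: the table size, embedding width, n-gram order, hash function, and the names `ngram_bucket` and `lookup_memory` are all assumptions chosen for illustration. It only shows the general technique of hashing the trailing n-gram of token IDs into a fixed-size table so that each position costs one table read.

```python
import random

# Illustrative sizes only; none of these values come from the paper.
TABLE_SIZE = 1 << 12   # number of hash buckets
DIM = 8                # width of each memory vector
N = 2                  # n-gram order

rng = random.Random(0)
# Fixed-size embedding table. In a real model these rows would be learned.
table = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(TABLE_SIZE)]


def ngram_bucket(ngram):
    """Map a tuple of token IDs to a bucket index with a polynomial hash."""
    h = 0
    for tok in ngram:
        h = (h * 1_000_003 + tok) % TABLE_SIZE
    return h


def lookup_memory(token_ids, n=N):
    """Return one memory vector per position, keyed by the n-gram ending there.

    Each position needs one hash and one table read, so the cost per token
    does not grow with the table size or the sequence length.
    """
    out = []
    for i in range(len(token_ids)):
        if i + 1 < n:
            out.append([0.0] * DIM)  # not enough left context for an n-gram
        else:
            ngram = tuple(token_ids[i + 1 - n : i + 1])
            out.append(table[ngram_bucket(ngram)])
    return out


tokens = [5, 9, 5, 9]
memory = lookup_memory(tokens)
```

In this toy run, positions 1 and 3 both end the bigram `(5, 9)`, so they retrieve the same row. Distinct n-grams can collide in the same bucket; how a production system handles collisions is outside the scope of this sketch.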

CFG_Labs - e/acc

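One of my favorite findings: positional embeddings are like training wheels. They help the model converge but hurt long-context generalization. We found that if you simply drop the positional embeddings after pretraining and recalibrate with less than 1% of the original budget, you unlock massive context windows.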

Introducing DroPE: Extending the Context of Pretrained LLMs by Dropping Their Positional Embeddings https://pub.sakana.ai/DroPE/ We are releasing a new method called DroPE to extend the context length of pretrained LLMs without the massive compute costs usually associated with
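As a rough illustration of what "dropping" positional embeddings means at the attention level, the sketch below computes causal attention weights with and without a rotary position embedding. It assumes the positional scheme is RoPE, which the posts above do not state, and the names `rope` and `attention_weights` are hypothetical. It does not reproduce the DroPE method itself: in particular, the recalibration step is not shown.

```python
import math


def rope(vec, pos, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles (RoPE)."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out


def attention_weights(queries, keys, use_rope=True):
    """Causal softmax attention weights for a single head.

    Position enters only through the optional RoPE rotation; with
    use_rope=False the only positional signal left is the causal mask.
    """
    if use_rope:
        queries = [rope(q, p) for p, q in enumerate(queries)]
        keys = [rope(k, p) for p, k in enumerate(keys)]
    scale = 1.0 / math.sqrt(len(queries[0]))
    weights = []
    for i, q in enumerate(queries):
        # Position i may attend only to positions 0..i.
        logits = [scale * sum(a * b for a, b in zip(q, k)) for k in keys[: i + 1]]
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        z = sum(exps)
        weights.append([e / z for e in exps])
    return weights


# Identical content at every position isolates the effect of position.
x = [[1.0, 0.0, 1.0, 0.0] for _ in range(4)]
with_rope = attention_weights(x, x, use_rope=True)
without_rope = attention_weights(x, x, use_rope=False)
```

With identical content at every position, the weights without RoPE are uniform over the causal prefix, while the weights with RoPE favor nearby positions. That difference is the dependence on relative distance that is removed when the embedding is dropped.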
