Tether releases cross-platform QVAC BitNet LoRA framework, supporting training of billion-parameter AI models on consumer devices

ODAILY
03-17

According to an official announcement, Tether has released a cross-platform BitNet LoRA fine-tuning framework for QVAC Fabric, optimizing training and inference for Microsoft's BitNet (a 1-bit LLM architecture). The framework sharply reduces compute and memory requirements, enabling billion-parameter models to be trained and fine-tuned on laptops, consumer GPUs, and smartphones.
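The announcement does not include code, but the underlying mechanism is standard LoRA: the (here 1-bit quantized) base weights stay frozen and only a low-rank update is trained, so gradients and optimizer state cover a tiny fraction of the parameters. Below is a minimal PyTorch sketch of that idea; the `LoRALinear` class, the hyperparameters `r` and `alpha`, and the plain `nn.Linear` standing in for a BitNet projection are illustrative assumptions, not QVAC Fabric's actual API.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """LoRA adapter around a frozen base projection (illustrative sketch).

    In a BitNet setting the frozen base would be a 1-bit quantized layer;
    an ordinary nn.Linear stands in for it here.
    """

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # frozen base weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # Only A and B are trained: r*(in+out) values instead of in*out.
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base output plus the scaled low-rank update (zero at init,
        # since lora_b starts at zero).
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scale
```

Wrapping each attention and MLP projection this way is what makes fine-tuning fit on consumer hardware: for a 1B model with r=8, the trainable adapters amount to a few million parameters rather than a billion.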

This solution is the first to enable fine-tuning of BitNet models on mobile GPUs, including Qualcomm Adreno, ARM Mali, and Apple Bionic chips. In tests, a 125M-parameter model was fine-tuned in about 10 minutes and a 1B model in about an hour, and the approach scales to 13B-parameter models on mobile devices.

The framework also supports heterogeneous hardware, including Intel, AMD, and Apple Silicon, achieving 1-bit LLM LoRA fine-tuning on non-NVIDIA devices for the first time. On performance, BitNet models run inference 2 to 11 times faster on mobile GPUs than on CPUs while cutting memory usage by up to roughly 77.8% compared with traditional 16-bit models.
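For context on the memory figure, a back-of-envelope calculation shows where savings of this scale come from. The bit widths below are assumptions (BitNet b1.58 stores ternary weights, typically packed at roughly 2 bits each), not figures from the announcement; the quoted ~77.8% is lower than the weight-only saving, plausibly because embeddings, norms, and activations remain at higher precision.

```python
# Weight-only memory estimate for a 1B-parameter model (assumed bit widths).
params = 1_000_000_000
fp16_gb = params * 16 / 8 / 1e9    # 16-bit weights: 2.00 GB
ternary_gb = params * 2 / 8 / 1e9  # ~2-bit packed ternary weights: 0.25 GB
saving = 1 - ternary_gb / fp16_gb
print(f"fp16: {fp16_gb:.2f} GB, packed ternary: {ternary_gb:.2f} GB, "
      f"weight-only saving: {saving:.0%}")  # -> 88%
```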

Tether stated that this technology could break the dependence on high-end compute and cloud infrastructure, push AI training toward decentralized, local execution, and lay the groundwork for new application scenarios such as federated learning.
