PrismML releases the 1.58-bit Ternary Bonsai models, cutting memory use to one-ninth of a 16-bit model and beating competitors on intelligence density

According to ME News, on April 17 (UTC+8), as monitored by Beating, PrismML released the Ternary Bonsai series of language models. Using 1.58-bit (ternary) weights, the series cuts memory usage to one-ninth of a 16-bit model while maintaining high performance. It comes in three parameter scales (8B, 4B, and 1.7B) and is open-sourced on Hugging Face.

A 1.58-bit model restricts each weight in the neural network to one of three values: {-1, 0, +1}. Earlier 1-bit models, which pursued extreme compression, allowed only {-1, +1}. Adding the value 0 lets the model drop redundant connections, so it retains complex reasoning capability at a very small size.

The newly released Ternary Bonsai 8B weight file is only 1.75 GB and achieves an average benchmark score of 75.5. That is 5 points higher than PrismML's own 1-bit version, and the model significantly outperforms comparable dense models such as Qwen3 in "intelligence density" (benchmark performance per GB of VRAM).

Energy efficiency and running speed are the series' other core advantages. On the iPhone 17 Pro Max, the 8B version runs at 27 tok/s, with energy efficiency improved by approximately 3 to 4 times. For developers deploying high-performance AI on edge devices such as mobile phones and laptops, this means near-full-precision model quality with minimal memory overhead.

Ternary Bonsai models currently run natively on Apple devices through the MLX framework. Model weights are distributed under the Apache 2.0 license. (Source: ME)
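
For readers who want to check the storage figures, the sketch below works through the arithmetic in Python. It is illustrative only and makes several assumptions that are not in the report: the 8B model is taken to have exactly 8e9 parameters, 1 GB is taken as 1e9 bytes, and the `ternarize` helper uses a simple absmean scaling rule chosen for illustration, not PrismML's actual quantization method. The function names are hypothetical.

```python
import math

# Three possible values carry log2(3) bits of information, hence "1.58-bit".
BITS_PER_TERNARY_WEIGHT = math.log2(3)


def ternarize(weights):
    """Map floats to {-1, 0, +1}.

    Assumption: absmean scaling (divide by the mean absolute value, round,
    clamp). This is a common illustrative rule, not PrismML's confirmed method.
    """
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    ternary = [max(-1, min(1, round(w / scale))) for w in weights]
    return ternary, scale


def weight_file_gb(n_params, bits_per_weight):
    """Approximate weight-file size in GB (1 GB = 1e9 bytes, an assumption)."""
    return n_params * bits_per_weight / 8 / 1e9


def effective_bits_per_weight(file_gb, n_params):
    """Bits actually spent per weight, given a file size in GB."""
    return file_gb * 1e9 * 8 / n_params


N_PARAMS_8B = 8e9          # assumption: "8B" means exactly 8e9 parameters
REPORTED_FILE_GB = 1.75    # figure from the report

fp16_gb = weight_file_gb(N_PARAMS_8B, 16)
ideal_ternary_gb = weight_file_gb(N_PARAMS_8B, BITS_PER_TERNARY_WEIGHT)
effective_bits = effective_bits_per_weight(REPORTED_FILE_GB, N_PARAMS_8B)
compression_ratio = fp16_gb / REPORTED_FILE_GB

print(f"bits per ternary weight: {BITS_PER_TERNARY_WEIGHT:.3f}")
print(f"16-bit size: {fp16_gb:.2f} GB, ideal ternary size: {ideal_ternary_gb:.2f} GB")
print(f"effective bits per weight at 1.75 GB: {effective_bits:.2f}")
print(f"16-bit size / reported size: {compression_ratio:.2f}x")
print(ternarize([0.9, -0.05, -1.2, 0.3]))
```

Under these assumptions, the reported 1.75 GB file is slightly larger than the information-theoretic minimum for ternary weights, and the ratio to a 16-bit file comes out close to the one-ninth figure quoted above. The report does not say what accounts for the difference; scale factors or tensors kept at higher precision are plausible but unconfirmed explanations.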
