Apple releases open source language model OpenELM

ODAILY
04-25
Odaily News — Ahead of WWDC24, Apple released OpenELM, described as an "efficient language model with an open-source training and inference framework," on the Hugging Face platform. The model is open source: its source code, pre-trained weights, and training recipes are available in Apple's GitHub repository.

According to the release, OpenELM uses a layer-wise scaling strategy to allocate parameters more effectively across the layers of the Transformer model, improving accuracy. At a parameter count of roughly 1 billion, for example, OpenELM improves accuracy by 2.36% over OLMo while requiring only half as many pre-training tokens.

Unlike the common practice of releasing only model weights and inference code, with pre-training done on private datasets, Apple's release includes a complete framework for training and evaluating language models on public datasets, with training logs, multiple checkpoints, and pre-training configurations. Apple also released code to convert the model for the MLX library, enabling inference and fine-tuning on Apple devices. This comprehensive release is intended to strengthen the open research community and pave the way for future open research. (IT Home)
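For readers who want to try the release, the checkpoints on Hugging Face can be loaded with the standard transformers library. Below is a minimal sketch, assuming the apple/OpenELM-1_1B model ID and the separate Llama 2 tokenizer that Apple's model card points to; these details come from the Hugging Face release, not from the article above.

```python
# Minimal sketch: loading an OpenELM checkpoint from Hugging Face with
# the transformers library. Model ID and tokenizer choice are assumptions
# based on Apple's Hugging Face release, not stated in this article.
from transformers import AutoModelForCausalLM, AutoTokenizer

# OpenELM ships custom architecture code, so trust_remote_code is required.
model = AutoModelForCausalLM.from_pretrained(
    "apple/OpenELM-1_1B",  # assumed ID for the ~1B-parameter variant
    trust_remote_code=True,
)

# OpenELM does not bundle its own tokenizer; the model card directs users
# to the Llama 2 tokenizer (gated; requires accepting Meta's license).
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tokenizer("Once upon a time there was", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For on-device use on Apple hardware, the release additionally includes conversion code targeting the MLX library, as noted above.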
