Apple releases open source language model OpenELM

ODAILY
04-25
Odaily News — Ahead of WWDC24, Apple released OpenELM, described as an "efficient language model with an open-source training and inference framework," on the Hugging Face platform. The model is open source: its source code, pre-trained weights, and training recipes are available in Apple's GitHub repository.

According to the release, OpenELM uses a layer-wise scaling strategy to allocate parameters more effectively across the layers of the Transformer model, improving accuracy. At a parameter count of roughly 1 billion, for example, OpenELM improves accuracy by 2.36% over OLMo while requiring only half as many pre-training tokens.

Unlike the common practice of releasing only model weights and inference code, with pre-training done on private datasets, Apple's release includes a complete framework for training and evaluating language models on public datasets, with training logs, multiple checkpoints, and pre-training configurations. Apple also released code to convert the model for the MLX library, enabling inference and fine-tuning on Apple devices. This comprehensive release is intended to strengthen the open research community and pave the way for future open research. (IT Home)
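For readers who want to try the release, the checkpoints on Hugging Face can be loaded with the standard transformers library. Below is a minimal sketch, assuming the apple/OpenELM-1_1B model ID and the separate Llama 2 tokenizer that Apple's model card points to; these details come from the Hugging Face release, not from the article above.

```python
# Minimal sketch: loading an OpenELM checkpoint from Hugging Face with
# the transformers library. Model ID and tokenizer choice are assumptions
# based on Apple's Hugging Face release, not stated in this article.
from transformers import AutoModelForCausalLM, AutoTokenizer

# OpenELM ships custom architecture code, so trust_remote_code is required.
model = AutoModelForCausalLM.from_pretrained(
    "apple/OpenELM-1_1B",  # assumed ID for the ~1B-parameter variant
    trust_remote_code=True,
)

# OpenELM does not bundle its own tokenizer; the model card directs users
# to the Llama 2 tokenizer (gated; requires accepting Meta's license).
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tokenizer("Once upon a time there was", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For on-device use on Apple hardware, the release additionally includes conversion code targeting the MLX library, as noted above.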
