Alibaba Cloud confirms that the s1 model from Fei-Fei Li’s team was trained on top of Qwen

02-06


PANews, February 6: According to Sina Technology, researchers from Stanford University and the University of Washington, including Fei-Fei Li, trained an artificial intelligence reasoning model called s1 for less than $50 in cloud computing costs. Its performance on mathematics and coding tests is reported to be similar to that of advanced reasoning models such as OpenAI's o1 and DeepSeek's R1, which has attracted widespread attention. It was soon pointed out, however, that the s1 model "was not trained from scratch" and that its base model was the "Alibaba Tongyi Qwen model". A reporter checked with Alibaba Cloud, which confirmed the news and responded: "They used the open-source Alibaba Tongyi Qwen2.5-32B-Instruct model as the base, and after 26 minutes of supervised fine-tuning on 16 H100 GPUs, they trained a new model, s1-32B, which achieved results comparable to advanced reasoning models such as OpenAI's o1 and DeepSeek's R1 in mathematical and coding abilities, and even outperformed o1-preview by 27% on competition math problems."

