Qwen releases Qwen2.5-VL-32B multimodal model, with performance surpassing the larger 72B model

PANews
03-25

PANews reported on March 25th that, according to an announcement from the Qwen team, the Qwen2.5-VL-32B-Instruct model has been officially open-sourced. With 32B parameters, it delivers strong performance on tasks such as image understanding, mathematical reasoning, and text generation. The model was further optimized through reinforcement learning to align its responses more closely with human preferences, and it surpasses the previously released 72B model on multimodal benchmarks such as MMMU and MathVista. Compared with earlier Qwen2.5-VL models, the 32B model brings three improvements: responses better aligned with human subjective preferences, with output that is more detailed and better formatted; stronger mathematical reasoning, with significantly higher accuracy on complex problems; and more refined image understanding and reasoning, with greater accuracy and finer-grained analysis in tasks such as image parsing, content recognition, and visual logical reasoning.
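
Because the checkpoint is open-sourced, it can be run locally. Below is a minimal sketch of loading it with Hugging Face transformers for a single image-question turn; the repo id Qwen/Qwen2.5-VL-32B-Instruct, the Qwen2_5_VLForConditionalGeneration class, and the chat-template flow follow Qwen's published model-card conventions and are assumptions, not details stated in this article.

# Minimal sketch: load the open-sourced checkpoint and ask one question
# about a local image. The repo id and model class follow Qwen's model
# cards on Hugging Face (assumption; not stated in the article itself).
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

MODEL_ID = "Qwen/Qwen2.5-VL-32B-Instruct"  # assumed Hugging Face repo id

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(MODEL_ID)

# One image-plus-text user turn in the chat format the processor expects.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Solve the problem shown in this image step by step."},
        ],
    }
]
prompt = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
image = Image.open("problem.png")  # hypothetical local file
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

# Generate, then decode only the newly produced tokens.
output_ids = model.generate(**inputs, max_new_tokens=256)
answer = processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(answer)

Note that device_map="auto" shards the 32B weights across available GPUs via accelerate; a quantized variant would be needed to fit the model on smaller hardware.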
