【Introduction】 French AI startup Mistral has just released its first code generation model, Codestral. It supports more than 80 programming languages and a 32K context window, and with only 22B parameters it achieves performance comparable to the 70B Llama 3. The API and IDE plug-in integrations are now available to users.
Mistral, a truly open AI company, has quietly launched yet another new product.
This time, it has released its first code generation model, Codestral, which supports more than 80 programming languages and a 32K context window.
Not only does it post impressive numbers on benchmark tests, its code generation speed has also left early users very satisfied.
Currently, Codestral is accessible through multiple API endpoints, and the model weights have also been published on Hugging Face.
Project address: https://huggingface.co/mistralai/Codestral-22B-v0.1/tree/main
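If you want to try the open weights locally, a minimal sketch along these lines should work with the Hugging Face transformers library. Note that the repository is gated, so you may need to accept the license on the Hub and log in first, and the 22B model needs substantial GPU memory; the precision and device settings below are assumptions, not official guidance.

```python
# Minimal local-inference sketch for the open Codestral weights.
# Assumes: transformers and torch installed, license accepted on the Hub,
# and enough GPU memory for a 22B model in bfloat16 (~44 GB).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Codestral-22B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to keep memory manageable
    device_map="auto",           # spread layers across available devices
)

prompt = "def fibonacci(n: int) -> int:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```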
A New SOTA for Code Generation
Codestral was trained on more than 80 programming languages, including the most popular ones such as Python, Java, C, C++, and Bash, as well as front-end languages such as HTML and JavaScript, and it also performs well on Swift and Fortran.
The model can write functions to a given specification, generate tests, and fill in partial code.
In addition, since Codestral is also fluent in English, it can interact with developers in natural language, helping engineers improve their code and reduce errors and vulnerabilities.
The model's interactive features can be used free of charge through the Le Chat conversational interface.
Online address: https://chat.mistral.ai/chat
Despite having only 22B parameters, Codestral offers a 32K context window, four times the 8K window of the 70B Llama 3.
Although it is far smaller than the 70B models of the Llama family, Codestral's average HumanEval score across 7 languages exceeds CodeLlama's and is on par with Llama 3's.
RepoBench is a newer benchmark for repository-level code completion, which tests a model's ability to retrieve and understand long context spread across files. On RepoBench, Codestral achieved SOTA results on Python.
Codestral also scored well in the evaluations for other languages, including C++, Bash, Java, PHP, TypeScript, and C#.
The FIM (fill-in-the-middle) benchmark evaluates how well a model can complete code given both the text before and after a gap, a capability that CodeLlama and Llama do not directly support.
On the FIM task, Codestral outperforms DeepSeek Coder 33B in Python, JavaScript, and Java while using fewer parameters.
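To make the task concrete: in fill-in-the-middle, the model receives the code before and after a gap and writes what belongs in between. Below is a rough sketch of such a request against Mistral's FIM endpoint; the endpoint path, JSON fields, and the codestral-latest model name are taken from Mistral's API documentation at the time of writing and should be treated as assumptions to verify.

```python
# Sketch of a fill-in-the-middle (FIM) request to Codestral over plain HTTP.
# Endpoint path, field names, and model name are assumptions based on
# Mistral's public API docs -- check the current docs before relying on them.
import os
import requests

url = "https://api.mistral.ai/v1/fim/completions"
headers = {"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"}

payload = {
    "model": "codestral-latest",
    # Code before the gap the model should fill in.
    "prompt": "def is_palindrome(s: str) -> bool:\n    ",
    # Code after the gap; the model generates what belongs in between.
    "suffix": "\n\nprint(is_palindrome('level'))",
    "max_tokens": 64,
}

resp = requests.post(url, headers=headers, json=payload, timeout=30)
resp.raise_for_status()
# The generated middle segment is returned in the response body; see the
# API docs for the exact schema.
print(resp.json())
```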
Currently, Mistral provides two API endpoints for calling Codestral: codestral.mistral.ai and api.mistral.ai. The former comes with an 8-week free trial, while the latter is billed per token.
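For ordinary instruction-following use, Codestral can also be called through the standard chat completions endpoint. The sketch below assumes the codestral-latest model identifier and an API key in the MISTRAL_API_KEY environment variable; verify both against the current documentation.

```python
# Sketch of a chat-style request to Codestral via the billed api.mistral.ai
# endpoint. The model identifier is an assumption taken from Mistral's docs.
import os
import requests

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "codestral-latest",
        "messages": [
            {"role": "user",
             "content": "Write a Python function that merges two sorted lists."}
        ],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```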
In addition, Codestral can be used inside VS Code or JetBrains IDEs through the Continue.dev or Tabnine plugins.
Developers Are Already Using It
After all, benchmarks are only a reference; the only way to know whether a coding tool is useful is to try it.
Some netizens exclaimed that "80 languages is crazy" and "finally someone remembered Swift."
Moreover, actual tests show that Codestral generates code very quickly, with very low response latency.
GPT-4o and Codestral were given the same task: implementing a basic publish/subscribe system in Go.
Both models responded with little delay, but by the time Codestral had finished, GPT-4o was still only halfway through writing, so the difference in generation speed was immediately obvious.
Some developers said that although Codestral is not the largest or the best code model, they would still switch to it from Claude Opus, because Codestral carries more up-to-date knowledge and can help write code for the latest AI libraries, something neither ChatGPT nor Opus can do.
But some Python engineers complained: "No LLM understands that in Python 3.9 and later, it is no longer necessary to use from typing import List."
"GPT-4, GPT-4o, Claude Opus, Gemini, and Codestral all get this wrong, and even when it is explicitly pointed out, they still fail to follow it."
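For context, the complaint refers to PEP 585: since Python 3.9, the built-in collection types can be subscripted directly as generics, so the extra typing import that models keep inserting is redundant.

```python
# Pre-3.9 style that LLMs keep producing -- it still works, but the import
# is unnecessary on Python 3.9+.
from typing import List

def squares_old(ns: List[int]) -> List[int]:
    return [n * n for n in ns]

# Python 3.9+ style (PEP 585): built-in list is subscriptable directly.
def squares(ns: list[int]) -> list[int]:
    return [n * n for n in ns]

print(squares([1, 2, 3]))  # [1, 4, 9]
```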
It seems that one of the few remaining advantages of human programmers is "correcting mistakes as soon as they are discovered."
References:
https://mistral.ai/news/codestral/
This article comes from the WeChat public account "New Intelligence" (ID: AI_era), edited by Qiao Yanghaokun, and is published by 36Kr with authorization.