【Introduction】 French AI startup Mistral has just released its first code generation model, Codestral. It supports more than 80 programming languages and a 32K context window, and with only 22B parameters it achieves performance comparable to the 70B Llama 3. The API and IDE plug-in integrations are now available to users.
Mistral, a truly open AI company, has quietly launched yet another new product.
This time, it has released its first code generation model, Codestral, which supports more than 80 programming languages and a 32K context window.
Not only does it post impressive numbers on benchmark tests, its code generation speed has also left early users very satisfied.
Currently, Codestral is accessible through multiple API endpoints, and the model weights have also been published on Hugging Face.
Project address: https://huggingface.co/mistralai/Codestral-22B-v0.1/tree/main
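If you want to try the open weights locally, a minimal sketch along these lines should work with the Hugging Face transformers library. Note that the repository is gated, so you may need to accept the license on the Hub and log in first, and the 22B model needs substantial GPU memory; the precision and device settings below are assumptions, not official guidance.

```python
# Minimal local-inference sketch for the open Codestral weights.
# Assumes: transformers and torch installed, license accepted on the Hub,
# and enough GPU memory for a 22B model in bfloat16 (~44 GB).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Codestral-22B-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to keep memory manageable
    device_map="auto",           # spread layers across available devices
)

prompt = "def fibonacci(n: int) -> int:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```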
A New SOTA for Code Generation
Codestral was trained on more than 80 programming languages, including the most popular ones such as Python, Java, C, C++, and Bash, as well as front-end languages such as HTML and JavaScript, and it also performs well on Swift and Fortran.
The model can write functions to a given specification, generate tests, and fill in partial code.
In addition, since Codestral is also fluent in English, it can interact with developers in natural language, helping engineers improve their code and reduce errors and vulnerabilities.
The model's interactive features can be used free of charge through the Le Chat conversational interface.
Online address: https://chat.mistral.ai/chat
Despite having only 22B parameters, Codestral offers a 32K context window, four times the 8K window of the 70B Llama 3.
Although it is far smaller than the 70B models of the Llama family, Codestral's average HumanEval score across 7 languages exceeds CodeLlama's and is on par with Llama 3's.
RepoBench is a newer benchmark for repository-level code completion, which tests a model's ability to retrieve and understand long context spread across files. On RepoBench, Codestral achieved SOTA results on Python.
Codestral also scored well in the evaluations for other languages, including C++, Bash, Java, PHP, TypeScript, and C#.
The FIM (fill-in-the-middle) benchmark evaluates how well a model can complete code given both the text before and after a gap, a capability that CodeLlama and Llama do not directly support.
On the FIM task, Codestral outperforms DeepSeek Coder 33B in Python, JavaScript, and Java while using fewer parameters.
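To make the task concrete: in fill-in-the-middle, the model receives the code before and after a gap and writes what belongs in between. Below is a rough sketch of such a request against Mistral's FIM endpoint; the endpoint path, JSON fields, and the codestral-latest model name are taken from Mistral's API documentation at the time of writing and should be treated as assumptions to verify.

```python
# Sketch of a fill-in-the-middle (FIM) request to Codestral over plain HTTP.
# Endpoint path, field names, and model name are assumptions based on
# Mistral's public API docs -- check the current docs before relying on them.
import os
import requests

url = "https://api.mistral.ai/v1/fim/completions"
headers = {"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"}

payload = {
    "model": "codestral-latest",
    # Code before the gap the model should fill in.
    "prompt": "def is_palindrome(s: str) -> bool:\n    ",
    # Code after the gap; the model generates what belongs in between.
    "suffix": "\n\nprint(is_palindrome('level'))",
    "max_tokens": 64,
}

resp = requests.post(url, headers=headers, json=payload, timeout=30)
resp.raise_for_status()
# The generated middle segment is returned in the response body; see the
# API docs for the exact schema.
print(resp.json())
```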
Currently, Mistral provides two API endpoints for calling Codestral: codestral.mistral.ai and api.mistral.ai. The former comes with an 8-week free trial, while the latter is billed per token.
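For ordinary instruction-following use, Codestral can also be called through the standard chat completions endpoint. The sketch below assumes the codestral-latest model identifier and an API key in the MISTRAL_API_KEY environment variable; verify both against the current documentation.

```python
# Sketch of a chat-style request to Codestral via the billed api.mistral.ai
# endpoint. The model identifier is an assumption taken from Mistral's docs.
import os
import requests

resp = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "codestral-latest",
        "messages": [
            {"role": "user",
             "content": "Write a Python function that merges two sorted lists."}
        ],
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```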
In addition, Codestral can be used inside VS Code or JetBrains IDEs through the Continue.dev or Tabnine plugins.
Developers Are Already Using It
After all, benchmarks are only a reference; the only way to know whether a coding tool is useful is to try it.
Some netizens exclaimed that "80 languages is crazy" and "finally someone remembered Swift."
Moreover, actual tests show that Codestral generates code very quickly, with very low response latency.
GPT-4o and Codestral were given the same task: implementing a basic publish/subscribe system in Go.
Both models responded with little delay, but by the time Codestral had finished, GPT-4o was still only halfway through writing, so the difference in generation speed was immediately obvious.
Some developers said that although Codestral is not the largest or the best code model, they would still switch to it from Claude Opus, because Codestral carries more up-to-date knowledge and can help write code for the latest AI libraries, something neither ChatGPT nor Opus can do.
But some Python engineers complained: "No LLM understands that in Python 3.9 and later, it is no longer necessary to use from typing import List."
"GPT-4, GPT-4o, Claude Opus, Gemini, and Codestral all get this wrong, and even when it is explicitly pointed out, they still fail to follow it."
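For context, the complaint refers to PEP 585: since Python 3.9, the built-in collection types can be subscripted directly as generics, so the extra typing import that models keep inserting is redundant.

```python
# Pre-3.9 style that LLMs keep producing -- it still works, but the import
# is unnecessary on Python 3.9+.
from typing import List

def squares_old(ns: List[int]) -> List[int]:
    return [n * n for n in ns]

# Python 3.9+ style (PEP 585): built-in list is subscriptable directly.
def squares(ns: list[int]) -> list[int]:
    return [n * n for n in ns]

print(squares([1, 2, 3]))  # [1, 4, 9]
```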
It seems that one of the few remaining advantages of human programmers is "correcting mistakes as soon as they are discovered."
References:
https://mistral.ai/news/codestral/
This article comes from the WeChat public account "New Intelligence" (ID: AI_era), edited by Qiao Yanghaokun, and is published by 36Kr with authorization.