Why did Liang Wenfeng create DeepSeek?

Source: AI Technology Review

He is the hottest tech star at the start of 2025. In just a few days, details of Liang Wenfeng's past, large and small, have been made public, including his unfinished new house and the tent he sleeps in at home, all of which have become symbols of his unique personality.

A unique personality certainly fascinates people, but it is not the key to success. Over the past decade or so, this once-unknown university student has had only his ideas and abilities to rely on.

Everyone is curious about one question: why was it Liang Wenfeng who created DeepSeek? The times certainly played a part, as did personal experiences that set him apart from other large model researchers. But AI Technology Review believes that understanding what kind of person Liang Wenfeng is holds the key to answering it.

1 Finding Talent Does Not Require Labels

Headhunters find it extremely hard to recruit for Liang Wenfeng's company.

A headhunter who has been working closely with High-Flyer since 2021 told us that recruiting for it makes him "want to cry" because the bar is so high.

"A candidate with Tsinghua undergraduate and doctoral degrees and six top-conference papers: you would think there would be no problem, right? But the resume was rejected outright. Another, with a Tsinghua undergraduate degree and an MIT doctorate, was eliminated in the second round of interviews."

As for finding candidates inside large companies, he believes that High-Flyer and DeepSeek basically do not benchmark against domestic companies; they benchmark only against overseas giants like Google and Meta.

Another headhunter could not hide his frustration when DeepSeek came up: "They are too picky. I recommended a young middle manager who had performed very well at ByteDance, but he didn't pass the interview. I was very puzzled, so I asked them, and the answer I got was that the person didn't have a passion for AI." Candidates who have worked on AI agent-related projects generally do not get this kind of feedback.

Liang Wenfeng attaches no labels to talent. Regardless of educational background or past performance, he looks only at an individual's abilities and personal qualities.

This extremely high bar for talent has made DeepSeek what it is today. Among domestic large model teams, DeepSeek's talent depth may not match that of the top companies, but its talent density is arguably in the first tier.

Beyond DeepSeek's high salaries, a management model that fully respects creativity and ideas is also key to retaining this talent. "No fixed team, no reporting relationship, no annual plan" is closer to trust than to management. The Netflix Culture Handbook once said, "Excellent colleagues and daunting challenges are the biggest factors attracting people to work at the company." For AI practitioners, there is no greater challenge than AGI.

To do the most difficult things, you need to find the best people and give them sufficient resources and trust. Top talent that is trusted often delivers explosive results, a claim borne out by the rise of Douyin.

During the 2018 Spring Festival, Douyin's daily new user growth exceeded tens of millions. A growth product manager once recalled that the growth project carried no performance pressure at all: he sent one email to the finance department, and an additional billion-level advertising budget appeared in his account. That was when he realized, "With a team like this, what can't be won?"

DeepSeek is the same. Resumes that were screened out were certainly not rejected over academic credentials, and candidates who failed the interview certainly did not fail for lack of ability. What it looks for in talent can be summed up in one question: is this a person who can be trusted to work towards AGI?

This is DeepSeek's view of talent, and understanding it is the first step to understanding Liang Wenfeng.

2 Minimalist Values

Although he has been doing quantitative trading for many years, Liang Wenfeng does not consider himself a finance person. He sees himself as "doing AI, just in the quantitative scenario."

Almost everyone who has communicated with Liang Wenfeng says that he is a person who is not easily distracted by external factors, "his way of thinking is extremely pure, especially emphasizing first principles," "speaks slowly," and "hits the nail on the head as soon as he opens his mouth."

The characteristics of quantitative investment perfectly match his minimalist style - it does not require dealing with complex upstream and downstream industry chains, just focusing on pure market data.

To this day, Liang Wenfeng often immerses himself in his own technical world, focused on solving problems. On building large models, for example, he tells others, "If you've figured it out, just do it, as long as you have GPUs." Other difficulties do not enter his consideration.

His attitude towards money is the same. Money is for investment or charitable causes; as long as it is spent in the right place, losses are not worth mentioning.

At the end of 2023, a team building a sign language large model to support deaf and mute people approached Liang Wenfeng for investment. Liang Wenfeng pointed out that the project's strength was its strong public welfare nature, its weakness was the limited market size, and its hidden risk was that it was run by a student team from a top university who might not persist in the long run.

Although he was very likely to get no return, he still offered to invest as long as the team was willing to keep pushing the project forward.

In the past, Liang Wenfeng would set aside 500 million yuan per year for investment or charity; now he is spending that money on DeepSeek. Stock trading is for making money, and investing in large models is for AGI. That is all.

DeepSeek has nearly 20,000 GPUs, and he is extremely generous with computing power. He promised the sign language large model team mentioned above that the computing cluster would be open to them at any time. But he is also a bit "stingy": he demands a very high utilization rate from these nearly 20,000 GPUs, striving to keep them fully loaded and never idle.

These two behaviors seem contradictory, but from the perspective of minimalism they make sense: the GPUs exist to be used, and they should be used to the fullest, without any waste.

3 Not Limited by Commercialization

Without spending a penny on advertising, DeepSeek's app gained one hundred million users in just 7 days. How does Liang Wenfeng view this miraculous growth?

An investor put this very question to Liang Wenfeng during the Spring Festival, but Liang Wenfeng seemed completely unconcerned about such huge traffic. His reply to the investor was, "It's still a long way from AGI."

This is not Liang Wenfeng putting on airs. According to AI Technology Review, DeepSeek has assigned only two or three people to handle app maintenance, development of the chat web page, and management of the billing backend. So it is no surprise that the product is not user-friendly.

DeepSeek's conduct in the enterprise (B-end) market is more widely known. For example, its private deployment was priced at only 450,000 yuan for one year, which covered not only the right to use an H20 or 910B but also large model services. At the same price on Huawei Cloud, you can only rent a 910B for one year, which means that DeepSeek's large models are almost free.

Private deployment doesn't make money, and DeepSeek doesn't care about making money from its API either. An employee of a large company who worked with DeepSeek complained that it has a "use it if you want, don't if you don't" attitude: the service has always been hard to use, and DeepSeek never adjusts it.

No matter how large the customer or the call volume, none gets special attention. Every large company has to queue during peak periods, and the user experience is very poor. Major customers have given plenty of feedback, asking DeepSeek to expand capacity, or at least to respond more smoothly and not let one out of every two requests fail, which is almost unbearable.

The outside world is in an uproar, but Liang Wenfeng doesn't seem to care much about this.

How should this situation be solved? Many companies are troubled by this. According to some internal information, Liang Wenfeng believes that large companies are fully capable of figuring out how to solve the problem of request failures on their own, and they should provide their own fallback, rather than over-relying on DeepSeek to ensure service.

This answer is simply enough to make people laugh.

It can be said that the current Liang Wenfeng does not care about commercialization at all.

While many teams are focused on applications, Liang Wenfeng once told a good friend, "Don't keep looking at applications and industry implementation. If you do that now, you'll only end up constraining yourself, because the time hasn't come yet, and everything you're thinking now is wrong. You would just be investing more time, energy and money on the wrong path."

This is advice to a good friend, and it is also what he practices himself. For High-Flyer, and for whatever he does, putting energy into applications and commercialization is the wrong path.

There has only ever been one right path, and he is on it now.
