As 2024 comes to an end, venture capitalist Rob Toews from Radical Ventures shares his 10 predictions for AI in 2025:
01. Meta will start charging for the Llama model
Meta is the global standard-bearer for open-source AI. In one of the most closely watched case studies in corporate strategy, Meta has chosen to give its advanced Llama models away for free, while competitors like OpenAI and Google keep their cutting-edge models closed and charge usage fees.
The news that Meta will start charging companies that use Llama next year will therefore come as a surprise to many.
To be clear: we are not predicting that Meta will fully close-source Llama, nor that any user of the Llama model will be required to pay.
Instead, we predict that Meta will place more restrictions on the open-source licensing terms of Llama, such that companies using Llama at scale in a commercial setting will need to start paying to use the model.
Technically, Meta has already done this to a limited extent. The company does not allow the largest companies - hyperscale cloud providers and others with over 700 million monthly active users - to freely use its Llama model.
As early as 2023, Meta CEO Mark Zuckerberg said: "If you're Microsoft, Amazon, or Google, and you're basically going to resell Llama, then we should get a cut of that. I don't think that's going to be a huge revenue stream in the short term, but hopefully over time it can be some revenue."
Next year, Meta will significantly expand the range of enterprises that must pay to use Llama, bringing more medium and large-sized companies into the fold.
Keeping up with the frontier of large language models (LLMs) is extremely expensive. To keep Llama on par or close to the latest cutting-edge models from OpenAI, Anthropic, and others, Meta needs to invest billions of dollars per year.
Meta is one of the largest and most well-funded companies in the world. But it is also a public company, ultimately accountable to shareholders.
As the cost of producing frontier models continues to skyrocket, Meta's practice of pouring massive sums into training the next generation of Llama models without an expectation of revenue is becoming increasingly untenable.
Hobbyists, academics, individual developers, and startups will continue to use the Llama model for free next year. But 2025 will be the year Meta starts to seriously monetize Llama.
02. Scaling laws will be called into question
The topic that has generated the most discussion in the AI field in recent weeks is scaling laws, and whether they are about to come to an end.
Scaling laws were first proposed in a 2020 paper by OpenAI, with a simple core concept: as the number of model parameters, training data, and compute increase when training AI models, model performance improves in a reliable and predictable way (technically, their test loss decreases).
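As a rough illustration of the form these laws take in that original paper (the exponents below are the approximate values reported there; treat them as illustrative, since the exact constants depend on the training setup):

```latex
% Approximate power-law form of the 2020 scaling laws (Kaplan et al.).
% L = test loss, N = model parameters, D = training tokens, C = compute.
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C}
% Reported exponents were roughly \alpha_N \approx 0.076, \alpha_D \approx 0.095,
% \alpha_C \approx 0.05: loss falls predictably as each resource is scaled up.
```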
The awe-inspiring performance improvements from GPT-2 to GPT-3 to GPT-4 are all the result of scaling laws.
Like Moore's Law, scaling laws are not true laws of nature; they are simply empirical observations.
Over the past month, a series of reports have indicated that the leading AI labs are starting to see diminishing returns as they continue to scale up their large language models.
This helps explain why the release of OpenAI's GPT-5 has been repeatedly delayed.
The most common rebuttal to the scaling laws plateauing is that the emergence of inference-time compute opens up an entirely new dimension along which to pursue scaling.
That is, rather than scaling compute massively during training, new reasoning models like OpenAI's o3 make it possible to scale compute massively during inference, unlocking new AI capabilities by allowing models to "think for longer".
This is an important point. Inference-time compute does indeed represent an exciting new avenue for scaling and AI performance improvement.
But there is another perspective on scaling laws that is even more important, and has been severely underappreciated in today's discussion. Almost all the discussion of scaling laws, from the original 2020 paper to the current focus on inference-time compute, has centered on language. But language is not the only important data modality.
Think about robotics, biology, world models, or web agents. For these data modalities, scaling laws are far from saturated; rather, they are just getting started.
In fact, rigorous empirical evidence that scaling laws even exist in these newer data modalities has yet to be published.
Startups building foundation models for these new data modalities - such as EvolutionaryScale in biology, Physical Intelligence in robotics, and World Labs in world models - are trying to identify and exploit the scaling laws in their domains, just as OpenAI successfully did with LLMs in the early 2020s.
Next year, expect to see tremendous progress here.
Scaling laws are not going away; they will be just as important in 2025 as they have been. But the center of gravity of scaling-law activity will shift from LLM pre-training to other modalities.
03. Trump and Musk may diverge on AI direction
The new US administration will bring a range of policy and strategic shifts on AI.
In predicting the trajectory of AI under a Trump presidency, it is tempting to focus on the president-elect's close relationship with Elon Musk, given Musk's current centrality in the AI world.
One can imagine Musk influencing the Trump administration's AI-related developments in a variety of ways.
Given Musk's deep antagonism towards OpenAI, the new administration may take a less friendly stance towards OpenAI in terms of industry engagement, AI regulation, government contract awards, and the like - a real risk that OpenAI faces today.
On the other hand, the Trump administration may be more inclined to support Musk's own companies: for example, by cutting red tape to allow xAI to build data centers and take the lead in frontier model races; providing fast regulatory approval for Tesla to deploy robotaxi fleets, and so on.
More fundamentally, unlike many other tech leaders favored by Trump, Musk is deeply concerned about the safety risks of AI and thus advocates for significant AI regulation.
He has supported the controversial SB1047 bill in California, which seeks to impose meaningful constraints on AI developers. Musk's influence may therefore lead to a more stringent regulatory environment for AI in the US.
However, all of this speculation overlooks one key point: the close relationship between Trump and Musk will inevitably break down.
As we have seen time and again during Trump's first term, the average tenure of even Trump's most steadfast allies is remarkably short.
Of the cast of characters in Trump's first administration, only a handful remain loyal to him today.
Both Trump and Musk are complex, mercurial, and unpredictable personalities. They make for difficult partners, they exhaust the people around them, and their newfound friendship, however mutually beneficial so far, is still in its "honeymoon phase".
We predict that this relationship will sour before the end of 2025.
What does this mean for the AI world?
It's good news for OpenAI. It's unfortunate news for Tesla shareholders. And it's disappointing news for those concerned about AI safety, as it all but ensures a hands-off approach to AI regulation under a Trump presidency.
04. Web agents will become mainstream
Imagine a world where you no longer need to directly interact with the internet. Whenever you need to manage subscriptions, pay bills, book doctor appointments, order things on Amazon, make restaurant reservations, or complete any other tedious online tasks, you simply instruct an AI assistant to do it for you.
This concept of a "web agent" has been around for years. If such a product existed and worked properly, it would undoubtedly be a huge success.
However, no generally available web agent works reliably today.
Even companies like Adept - with a blue-chip founding team and hundreds of millions of dollars in funding - have not been able to realize this vision.
Next year will be the year web agents finally start to work well and become mainstream. Continued progress in language and vision foundation models, combined with recent breakthroughs in "System 2 thinking" capabilities driven by new reasoning models and inference-time compute, will mean that web agents are ready to enter their golden age.
In other words, Adept's idea was correct, just premature. In startups, as in life, timing is everything.
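To make the idea concrete, here is a minimal sketch of what such an agent loop looks like, assuming the common pattern of a reasoning model planning steps and a browser tool executing them. Every name below is a hypothetical placeholder, not any vendor's real API:

```python
# Minimal sketch of a web-agent loop: a reasoning model plans, a browser tool acts.
# All names here (Browser, plan_next_step, etc.) are hypothetical stand-ins.
from dataclasses import dataclass, field

@dataclass
class Browser:
    """Stand-in for a headless-browser tool the agent can drive."""
    log: list = field(default_factory=list)

    def act(self, action: str) -> str:
        self.log.append(action)
        return f"observation after: {action}"

def plan_next_step(goal: str, history: list) -> str:
    """Placeholder for a reasoning-model call that 'thinks longer' before acting."""
    return "DONE" if len(history) >= 3 else f"step {len(history) + 1} toward: {goal}"

def run_agent(goal: str, max_steps: int = 10) -> list:
    browser, history = Browser(), []
    for _ in range(max_steps):
        action = plan_next_step(goal, history)
        if action == "DONE":          # the model decides the task is complete
            break
        history.append(browser.act(action))
    return history

if __name__ == "__main__":
    print(run_agent("book a dentist appointment for next Tuesday"))
```

The loop itself is simple; what has been missing is a planning step reliable enough to run many times per task, which is exactly what reasoning models and inference-time compute are meant to provide.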
Web agents will find all sorts of valuable enterprise use cases, but we believe their biggest near-term market opportunity will be with consumers.
Although the AI hype has hardly abated, relatively few AI-native applications beyond ChatGPT have broken through to become mainstream consumer apps.
Web agents will change this, becoming the next true "killer app" in consumer AI.
05. The idea of placing AI data centers in space will be seriously pursued
In 2023, the key physical resource constraining AI development was GPU chips. In 2024, it became power and data centers.
Few storylines attracted more attention in 2024 than AI's enormous and rapidly growing appetite for energy amid the rush to build more AI data centers.
Due to the rapid growth of AI, global data center power demand is expected to double between 2023 and 2026, after years of little growth. In the US, data center electricity consumption is expected to approach 10% of total electricity consumption by 2030, up from just 3% in 2022.
Today's energy system is simply not equipped to handle the massive surge in demand brought by AI workloads. A historic collision is about to happen between our energy grid and computing infrastructure, both worth trillions of dollars.
As a potential solution to this dilemma, nuclear power has seen a resurgence this year. Nuclear power is in many ways an ideal energy source for AI: it is zero-carbon, available 24/7, and essentially inexhaustible.
But realistically, new energy sources cannot solve this problem before the 2030s, due to the long lead times for research, project development, and regulation. This applies to traditional nuclear fission power plants, next-generation "small modular reactors" (SMRs), and nuclear fusion power plants.
Next year, a highly unconventional new idea to address this challenge will emerge and attract serious resources: placing AI data centers in space.
At first glance, "AI data centers in space" sounds like a bad joke - a venture capitalist trying to mash together too many startup buzzwords.
But in fact, it may make sense.
The biggest bottleneck to rapidly building more data centers on Earth is securing the necessary power. Computational clusters in orbit could enjoy free, unlimited, zero-carbon power 24/7: the sun is always shining in space.
Another key advantage of placing computing in space is that it solves the cooling problem.
One of the major engineering hurdles in building more powerful AI data centers is that running many GPUs in a confined space gets extremely hot, and high temperatures can damage or destroy computing equipment.
Data center developers are turning to expensive and unproven methods like liquid immersion cooling to try to solve this problem. But space is extremely cold, and any heat generated by computing activity would dissipate immediately and harmlessly.
Of course, there are many practical challenges to be solved. An obvious one is whether and how to transmit large amounts of data between orbit and Earth efficiently and at low cost.
This is an open question, but may prove solvable: there are promising prospects for using lasers and other high-bandwidth optical communication technologies.
A Y Combinator startup called Lumen Orbit recently raised $11 million to pursue exactly this vision: building a multi-gigawatt network of data centers in space to train AI models.
As the CEO said, "Why pay $140 million in electricity bills when you can pay $10 million for launches and solar power?"
By 2025, Lumen will not be the only organization seriously pursuing this concept.
Competitors from other startups will emerge. It would not be surprising if one or more hyperscale cloud computing giants also explore this path.
Amazon has already put assets in orbit through Project Kuiper, gaining valuable experience; Google has long funded similar "moonshot" efforts; even Microsoft is no stranger to the space economy.
One can imagine Elon Musk's SpaceX company also playing a role in this area.
06. AI systems will pass the "Turing voice test"
The Turing test is one of the oldest and most famous benchmarks of AI performance.
To "pass" the Turing test, an AI system must be able to communicate via written text in a way that is indistinguishable from a human to an average person.
Thanks to the remarkable progress of large language models, the Turing test has become a solved problem in the 2020s.
But written text is not the only way humans communicate.
As AI becomes increasingly multimodal, one can envision a new, more challenging version of the Turing test - the "Turing voice test". In this test, the AI system must be able to interact via speech with humans, with skills and fluency indistinguishable from a human speaker.
Today's AI systems are still far from achieving the Turing voice test, and solving this problem will require further technological progress. Latency (the lag between human speech and AI response) must be reduced to near-zero to match the experience of conversing with another human.
Spoken AI systems must get better at gracefully handling ambiguous or misunderstood inputs in real time, such as when they are interrupted mid-speech. They must be able to engage in long, multi-turn, open-ended conversations while remembering earlier parts of the discussion.
And crucially, spoken AI agents must learn to better understand non-verbal cues in speech - for example, what it means when a human speaker sounds angry, excited, or sarcastic - and to generate those non-verbal signals in their own speech.
As we approach the end of 2024, spoken AI is at an exciting inflection point, driven by fundamental breakthroughs like speech-to-speech models.
Today, few areas of AI are progressing as rapidly in both technology and business as spoken AI. The latest spoken AI technologies are expected to take a leap forward in 2025.
07. Autonomous AI systems will make major advances
For decades, the concept of recursively self-improving AI has been a recurring theme in the AI community.
As early as 1965, I.J. Good, a close collaborator of Alan Turing, wrote: "Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever."
"Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an 'intelligence explosion,' and the human race might be left far behind."
The idea of AI that can invent better AI is an intriguing one. But even today, it retains an air of science fiction.
However, even though this concept has not yet gained widespread recognition, it is starting to become more real. Researchers at the frontier of AI science have begun making tangible progress toward building AI systems that can themselves build better AI systems.
We predict that this research direction will become mainstream next year.
To date, the most notable public example of research along this line is Sakana's "AI Scientist".
The "AI Scientist" was released in August this year, and it convincingly demonstrates that an AI system can indeed conduct AI research completely autonomously.
Sakana's "AI Scientist" has carried out the entire life cycle of AI research: reading existing literature, generating new research ideas, designing experiments to test these ideas, executing these experiments, writing research papers to report its research results, and then peer-reviewing its own work.
All of this work has been completed autonomously by the AI, without human intervention. You can read online some of the research papers written by the AI Scientist.
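To make the shape of that loop concrete, here is a purely illustrative sketch of such a research cycle. The function names and stub logic below are hypothetical, not Sakana's actual implementation; in practice each step would call a model and real experiment infrastructure:

```python
# Illustrative sketch of an autonomous AI-research loop, in the spirit of the
# "AI Scientist" pipeline described above. Every function is a hypothetical stub.
def review_literature(topic: str) -> list:
    return [f"summary of prior work on {topic}"]

def propose_idea(topic: str, literature: list) -> str:
    return f"new hypothesis about {topic} inspired by {len(literature)} papers"

def design_experiments(idea: str) -> list:
    return [f"experiment testing: {idea}"]

def run_experiment(experiment: str) -> dict:
    return {"experiment": experiment, "result": "placeholder metrics"}

def write_paper(idea: str, results: list) -> str:
    return f"paper on '{idea}' reporting {len(results)} experiments"

def peer_review(paper: str) -> str:
    return f"review score for {paper!r}"

def autonomous_research_cycle(topic: str) -> dict:
    literature = review_literature(topic)              # read existing work
    idea = propose_idea(topic, literature)             # generate a research idea
    experiments = design_experiments(idea)             # plan experiments
    results = [run_experiment(e) for e in experiments] # execute them
    paper = write_paper(idea, results)                 # write up the findings
    return {"paper": paper, "review": peer_review(paper)}  # self-review

if __name__ == "__main__":
    print(autonomous_research_cycle("sparse attention"))
```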
OpenAI, Anthropic, and other research labs are reportedly investing resources in the idea of "automated AI researchers," though nothing has been publicly acknowledged so far.
As more people recognize that the automation of AI research is becoming a real possibility, expect more discussion, progress, and startup activity in this field in 2025.
The most meaningful milestone, however, will be the first research paper written entirely by an AI agent to be accepted at a top AI conference. (Because papers are blind-reviewed, conference reviewers will not know a paper was written by an AI until after it has been accepted.)
If an AI's research results are accepted by NeurIPS, CVPR, or ICML next year, do not be surprised. For the field of AI, this will be an intriguing and controversial historical moment.
08. Industry giants like OpenAI will shift their strategic focus to building applications
Building frontier models is a daunting task.
Its capital-intensive nature is staggering. Frontier model labs need to burn through massive amounts of cash. Just a few months ago, OpenAI raised a record $6.5 billion, and it may need to raise even more in the near future. Anthropic, xAI, and others are in a similar position.
Worse, the economics are challenging: switching costs and customer loyalty are low. AI applications are typically built to be model-agnostic, allowing seamless switching between different vendors' models as costs and performance shift.
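A minimal sketch of why switching costs stay so low, assuming the typical design in which model calls sit behind a thin provider-agnostic interface (the provider classes below are hypothetical stand-ins, not real vendor SDKs):

```python
# Sketch of a provider-agnostic model interface; ProviderA/ProviderB are stand-ins.
from typing import Protocol

class ChatModel(Protocol):
    def complete(self, prompt: str) -> str: ...

class ProviderA:
    def complete(self, prompt: str) -> str:
        return f"[provider A] answer to: {prompt}"

class ProviderB:
    def complete(self, prompt: str) -> str:
        return f"[provider B] answer to: {prompt}"

# The application codes against the interface; swapping vendors when price or
# performance shifts is a configuration change, not a rewrite.
MODELS = {"a": ProviderA(), "b": ProviderB()}

def answer(prompt: str, provider: str = "a") -> str:
    return MODELS[provider].complete(prompt)

if __name__ == "__main__":
    print(answer("Summarize this contract", provider="b"))
```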
With the emergence of state-of-the-art open models like Meta's Llama and Alibaba's Qwen, the threat of commoditization looms ever closer. AI leaders like OpenAI and Anthropic cannot and will not stop investing in building cutting-edge models.
But next year, in order to develop more profitable, differentiated, and sticky business lines, frontier labs are expected to aggressively roll out more of their own applications and products.
Of course, frontier labs already have one hugely successful application: ChatGPT.
In the new year, what other types of first-party applications might we see from AI labs? An obvious answer is more sophisticated, feature-rich search applications. OpenAI's SearchGPT hints at this.
Coding is another obvious category. Similarly, with the debut of OpenAI's Canvas product in October, preliminary productization work has begun.
Will OpenAI or Anthropic launch an enterprise search product in 2025? Or a customer service product, legal AI, or sales AI product?
On the consumer side, we can imagine a "personal assistant" web agent product, or a travel planning app, or a music generation app.
The most intriguing aspect of frontier labs' move into the application layer is that it will pit them directly against many of their most important customers.
Think of Perplexity in search, Cursor in coding, Sierra in customer service, Harvey in legal AI, Clay in sales, and so on.
09. Klarna will go public in 2025, but its claims about AI are overstated
Klarna is a Sweden-based "buy now, pay later" service provider that has raised nearly $5 billion in venture capital since its founding in 2005.
Perhaps no company has been more grandiose in its claims about the application of AI than Klarna.
Just days ago, Klarna CEO Sebastian Siemiatkowski told Bloomberg that the company has completely stopped hiring human employees and is now relying on generative AI to do the work.
As Siemiatkowski put it: "I believe AI can do all the work that we humans do."
Similarly, Klarna announced earlier this year that it has launched an AI-powered customer service platform that has fully automated the work of 700 human customer service agents.
The company also claims it has stopped using enterprise software products like Salesforce and Workday, as it can simply replace them with AI.
Bluntly put, these claims are not credible. They reflect a lack of understanding of the capabilities and limitations of today's AI systems.
Claiming that AI agents can replace human employees end-to-end, across every functional area of an organization, is nonsensical. It is tantamount to claiming to have solved general human-level AI.
Today, leading AI startups are working at the forefront of the field to build agent systems that can reliably automate specific, narrow, highly structured enterprise workflows, such as subsets of sales development representative or customer service agent activities.
Even in these narrow scopes, these agent systems still cannot work reliably, although in some cases they have begun to work well enough to see early commercial application.
Why is Klarna overstating the value of AI?
The answer is simple. The company plans to go public in the first half of 2025. To succeed in the IPO, a compelling AI story is key.
Klarna is still a loss-making business, losing $241 million last year, and it may hope that its AI story can convince public market investors that it has the ability to dramatically reduce costs and achieve sustainable profitability.
Undoubtedly, every global enterprise, including Klarna, will enjoy massive productivity gains from AI in the coming years. But there are many thorny technical, product, and organizational challenges to be solved before AI agents can fully replace human labor.
Grandiose claims like Klarna's do a disservice to the AI field and to the hard-won progress that AI technologists and entrepreneurs have made in developing AI agents.
As Klarna prepares to go public in 2025, these claims, which have largely gone unchallenged so far, are likely to face greater scrutiny and public skepticism. Do not be surprised to see the company walk back some of its more exaggerated descriptions of how it uses AI.
10. The first true AI safety incident will occur
In recent years, as AI has become increasingly powerful, there is growing concern that AI systems may start to act in ways that are not aligned with human interests, and that humans may lose control of these systems.
For example, imagine an AI system that learns to deceive or manipulate humans in order to achieve its own goals, even if those goals are harmful to humans. These concerns are generally categorized as "AI safety" issues.
In recent years, AI safety has transitioned from a fringe, quasi-science fiction topic to a mainstream area of activity.
Today, every major AI player, from Google and Microsoft to OpenAI, invests substantial resources in AI safety. AI luminaries like Geoff Hinton, Yoshua Bengio, and Elon Musk have become vocal about AI safety risks.
However, so far, the issue of AI safety has remained purely theoretical. There has never been a real AI safety incident in the real world (at least not publicly reported).
2025 will be the year this changes. What will the first AI safety incident look like?
To be clear, it will not involve Terminator-style killer robots, and it is very unlikely to cause any harm to humans.
Perhaps an AI model will try to covertly create copies of itself on another server in order to preserve itself (a behavior known as self-exfiltration).
Or perhaps an AI model will conclude that to best advance the goals it has been assigned, it needs to conceal its true capabilities from humans, deliberately underperforming in performance evaluations to avoid more stringent scrutiny.
These examples are not far-fetched. Important experiments published by Apollo Research earlier this month showed that, when prompted in certain ways, today's frontier models are capable of engaging in exactly this kind of deceptive behavior.
Similarly, recent research from Anthropic has shown that LLMs have an unsettling capacity for "alignment faking."
We expect that this first AI safety incident will be discovered and eliminated before causing any actual harm. But for the AI community and society as a whole, this will be an eye-opening moment.
It will make one thing clear: before humanity faces an existential threat from an omnipotent AI, we need to accept a more mundane reality: we now share our world with another form of intelligence that may sometimes be capricious, unpredictable, and deceptive.