While OpenAI is still wrestling with advertising monetization and the ideal of AGI, Apple and Google are already planning to "flank and steal" its position from behind.
On January 24th, Apple officially announced its most disruptive move since the birth of Siri: investing $1 billion in a deep collaboration with its former rival Google to completely revamp Siri with a customized Gemini engine. The upgrade will raise the parameter count of the model behind Siri from 150 billion to 1.2 trillion, transforming it into a truly intelligent AI agent capable of deep multi-turn dialogue and cross-app operation.
This new alliance has added a new dimension to the competition among AI assistants. Siri's radical overhaul is not the end but the true starting point of the final battle in mobile AI, opening a direct confrontation between Apple, Google, and Chinese domestic players. This time, domestic players not only stand on the same starting line as the global tech giants, but also have the chance to sprint faster and cross the finish line first.
Apple's "AI tutoring": Money power will never go out of style
Apple's AI strategy can be summarized in one sentence: "It's good to be rich."
GPT-4.1 supports context windows with up to 1 million tokens, equivalent to approximately 750,000 words. This means it can conduct long, in-depth, and logically coherent conversations, accurately remembering user preferences and previous discussions.
Compared to newer generation intelligent agents like GPT-4 and Claude, the biggest embarrassment of the older version of Siri over the past few years has been its weak natural language understanding and task execution capabilities. It cannot handle complex commands containing multiple intentions, let alone understand context for multi-turn conversations.
Closing such a huge gap from scratch would not only require astronomical R&D costs but, more importantly, would mean missing a precious window of opportunity. Apple therefore chose the most pragmatic strategy of "buying time with money," investing $1 billion to integrate the world-leading Gemini large model into its own system.
However, Apple, which supplemented Siri's "intelligence" with large models, did not simply turn it into a megaphone for Gemini. Instead, it bet on privacy and security, further strengthening its "ecosystem walls" and using a closed ecosystem to accelerate technological development.
To address users' primary concern, privacy, Apple has pledged to strictly limit the scope of long-term user memory that the new Siri can retain, with a focus on strengthening local data processing and access control. Gemini runs entirely on Apple's private cloud, user requests are tagged, and Google acts only as a technology provider supplying model inference capabilities, with no access to specific user data and no ability to use it for training.
As generative AI continues to iterate and evolve, the controversy over privacy has never ceased, from films and television shows like "Minority Report" and "Trial by Fire" to real-world cases such as AI face-swapping scams and Samsung employees leaking confidential information through ChatGPT.
This multi-layered collaboration, ensuring robust protection of user privacy, perfectly upholds Apple's established image as a "privacy guardian." Against the backdrop of growing anxiety about data security, the value of this strategy is becoming increasingly apparent.
Another major advantage of the upgraded Siri is its deep integration of powerful AI capabilities with Apple's seamless, multi-scenario ecosystem. According to the cooperation agreement, the upgraded Siri will achieve unprecedented depth of integration with Apple's entire suite of operating systems, including iOS, macOS, and watchOS.
Apple's core strength in maintaining its market dominance lies in locking users into its vast ecosystem through a closed loop of "hardware + services." With the addition of Gemini, users of phones, computers, and watches will enjoy a more consistent and seamless smart experience, greatly enhancing the "stickiness" of the iOS ecosystem.
From upholding user privacy to a comprehensive ecosystem upgrade, iOS, already the "king of stickiness" among operating systems, has with Gemini's help regained the confidence that it will not be overtaken in the AI era.
It's worth noting that Apple's latest financial report saw record highs in several areas, including global revenue, net profit, and iPhone sales. Apple's revenue in Greater China this quarter reached $25.5 billion, a surge of 38%, and its service business, represented by the "Apple tax," achieved a record high gross margin of 77%, becoming Apple's new cash cow.
Prioritizing both efficiency and compliance, Apple, Alibaba, and Tencent lead the A2A (Agent to Agent) model.
However, hidden within Apple's seemingly flawless strategic layout lies an almost unavoidable problem—the Chinese market.
Due to differences in data security regulations and internet regulatory policies, Google's full services have long been absent in mainland China. This means that the new Siri on mainland China versions of iPhones, Macs, and other devices will most likely be unable to directly call the full version of the Gemini model provided by Google, and will instead adopt a compromise solution of "self-developed model + adaptation by domestic manufacturers".
While overseas users are already enjoying the convenience of the new Siri, such as one-click ticket booking across apps, intelligent travel planning, and automatic photo organization, domestic users are likely to be facing a "crippled" version of Siri with stripped-down functions and reduced capabilities.
The huge gap in user experience has opened up a golden opportunity for astute domestic mobile phone manufacturers.
In the era of mobile AI, the core criterion for judging an AI assistant is no longer how beautiful a poem it can write, but how much it can "free up the user's hands" and, more importantly, whether it can complete real-world tasks safely and without sacrificing privacy.
In this regard, some domestic players have already demonstrated foresight and execution. What they are exploring is the Agent to Agent (A2A) model, which coincides with the approach taken by Apple and Google and is widely regarded by the industry as the future direction.
The core idea of the A2A model is to have the AI assistant act as a "general dispatcher." Each app encapsulates its core functions into independent, standardized sub-agents. When a user issues a command, the main AI agent is responsible for understanding the intent, breaking down the task, and dispatching work orders to the corresponding app sub-agents through a unified, authorized API so that they complete the task together.
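The "general dispatcher" idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Task` and `Dispatcher` names, the capability strings, and the lambda handlers are hypothetical and do not describe any vendor's actual interface. Intent understanding and task breakdown are taken as already done, so the sketch covers only the routing of work orders to registered app sub-agents.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    """One work order produced by breaking down a user command."""
    capability: str            # hypothetical capability name, e.g. "book_ticket"
    params: Dict[str, str]


class Dispatcher:
    """Main agent: routes work orders to registered app sub-agents."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Dict[str, str]], str]] = {}

    def register(self, capability: str,
                 handler: Callable[[Dict[str, str]], str]) -> None:
        # An app exposes one capability by registering a handler for it.
        self._handlers[capability] = handler

    def dispatch(self, tasks: List[Task]) -> List[str]:
        results = []
        for task in tasks:
            handler = self._handlers.get(task.capability)
            if handler is None:
                # No app has exposed this capability, so it is refused.
                results.append(f"unsupported: {task.capability}")
            else:
                results.append(handler(task.params))
        return results


# Hypothetical sub-agents standing in for a travel app and a maps app.
dispatcher = Dispatcher()
dispatcher.register("book_ticket", lambda p: f"ticket booked to {p['city']}")
dispatcher.register("plan_route", lambda p: f"route planned to {p['city']}")

plan = [
    Task("book_ticket", {"city": "Hangzhou"}),
    Task("plan_route", {"city": "Hangzhou"}),
    Task("order_food", {"city": "Hangzhou"}),  # no sub-agent registered
]
results = dispatcher.dispatch(plan)
```

The point of the design is that the main agent never touches an app's interface directly: it can only call what an app has chosen to register, and anything else is refused rather than improvised.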
Alibaba's "Qianwen" has begun in-depth exploration of the A2A model. Users only need to express their needs in the dialog box, and Qianwen can immediately understand the multiple intentions behind the needs. Then, it will dispatch "Fliggy" to search and book tickets and hotels, call "Gaode Map" to plan the trip route, and even link with "Taobao" to recommend and purchase the items needed by the user.
The entire process is smooth and precise, and it is completed within the secure and controllable framework of the Alibaba ecosystem.
The advantage of this model lies in its clear dual authorization mechanism: users must explicitly authorize which app functions the AI assistant may call, and app developers clearly define through their APIs which capabilities can be called externally, how often they can be called, and what range of data is exposed.
Every AI operation is traceable, with clearly defined rights and responsibilities. The model also builds a symbiotic ecosystem with far stronger synergy than the app store model, bringing new traffic entrances and business models to app developers.
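The dual authorization and traceability described above can be sketched roughly as follows. This is an illustrative assumption, not any platform's real mechanism: `AppPolicy`, `AuthorizationGate`, the capability names, and the simple call quota standing in for a frequency limit are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class AppPolicy:
    """What an app developer exposes to outside agents (hypothetical fields)."""
    capabilities: Set[str]
    max_calls: int  # simple call quota standing in for a frequency limit


@dataclass
class AuthorizationGate:
    user_grants: Dict[str, Set[str]] = field(default_factory=dict)
    app_policies: Dict[str, AppPolicy] = field(default_factory=dict)
    call_counts: Dict[str, int] = field(default_factory=dict)
    audit_log: List[Tuple[str, str, str]] = field(default_factory=list)

    def check(self, app: str, capability: str) -> bool:
        """Allow a call only if both the user and the app permit it."""
        policy = self.app_policies.get(app)
        if capability not in self.user_grants.get(app, set()):
            verdict = "denied:user"    # the user never granted this
        elif policy is None or capability not in policy.capabilities:
            verdict = "denied:app"     # the app does not expose this
        elif self.call_counts.get(app, 0) >= policy.max_calls:
            verdict = "denied:quota"   # the app's call limit is used up
        else:
            self.call_counts[app] = self.call_counts.get(app, 0) + 1
            verdict = "allowed"
        # Every decision is recorded, so each operation stays traceable.
        self.audit_log.append((app, capability, verdict))
        return verdict == "allowed"


gate = AuthorizationGate()
gate.user_grants["travel_app"] = {"book_ticket", "read_orders"}
gate.app_policies["travel_app"] = AppPolicy({"book_ticket"}, max_calls=1)

first = gate.check("travel_app", "book_ticket")   # both sides agree
second = gate.check("travel_app", "book_ticket")  # quota exhausted
third = gate.check("travel_app", "read_orders")   # app does not expose it
fourth = gate.check("travel_app", "pay")          # user never granted it
```

Only the first call passes. The audit log keeps all four decisions together with the reason for each refusal, which is what makes rights and responsibilities attributable after the fact.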
During the Q3 2025 earnings call, Tencent President Martin Lau stated that WeChat would also launch an AI agent. Leveraging Tencent's vast ecosystem, once WeChat Agent enters the market, it will likely be able to manage a massive number of WeChat mini-programs and services like Didi and Tongcheng, completing a one-stop closed loop from social interaction and travel to local services.
However, not all players have chosen the A2A route, which is safe but requires patience, on the road to AI phones. Some manufacturers have also explored the system-level GUI route of AI directly "reading the screen".
This route is represented by products such as ByteDance's Doubao phone, launched in collaboration with ZTE and Meizu, whose makers advocate the concept of "visual integration." The logic is to obtain extremely high permissions at the system's underlying level, allowing the AI to "read" the text and images on the screen like a real person and then "simulate" human fingers to click, swipe, and type, thereby operating any app on the phone.
This approach bypasses the lengthy communication, coordination, and interface adaptation process with app developers, and is theoretically compatible with all existing apps, allowing users to quickly experience the cool effect of "getting everything done in one sentence".
However, this "speed" comes at the cost of sacrificing user privacy and security. When the AI assistant needs to "read the screen," it means that all content displayed on the screen, such as the user's chat history, payment password input interface, and private photos, will be exposed to its "view" without reservation.
Although manufacturers promise that data will be processed locally rather than in the cloud, such promises seem weak in the face of an opaque technological black box. This is why the launch of the Doubao phone quickly sparked controversy and resistance within the industry, with major banks' financial apps immediately taking technical measures to block such simulated operations.
After all, no responsible platform would allow an unauthorized third party to act arbitrarily on its application interface.
This collaboration between Apple and Google demonstrates to all mobile phone manufacturers that introducing powerful large-model capabilities does not mean ignoring rules or trampling on privacy. The AI ecosystem must be built on a foundation of respect, cooperation, and mutual benefit. This is both a safeguard for users' assets and privacy and a power struggle between large-model vendors and mobile phone manufacturers.
While GPT's leading advantage may not yet be reflected in the commercial realm, the mobile AI assistants that such models support still have a good chance of becoming the "super gateway" of the AI era.
The core scenarios for GPT-type products are deep thinking activities such as coding, report writing, and creative work, requiring complex reasoning capabilities based on large models. However, ultimately, they are still "cold, impersonal tools" that cannot be integrated into daily life. In contrast, mobile assistants are closer to users psychologically, and their core competitiveness lies in their efficiency of "instant response" and their thoughtfulness of "understanding users best."
The story of Apple spending $1 billion to "buy a god-tier device" also tells us that the future AI competition will never be a "monopoly" by a single model company, but a comprehensive contest of "model capabilities + ecosystem integration + user trust".
Instead of waiting for a "crippled" version of Siri to be released, Chinese mobile phone manufacturers should take the initiative to cooperate deeply with internet giants such as Alibaba, Tencent, and Meituan, which control local life scenarios, and integrate with domestic mobile phone assistants through a safe and standardized A2A model to create a super AI assistant that truly "understands Chinese users".
Such an assistant would then form a one-stop service loop by integrating payment, social networking, travel, and other application capabilities, and would be able to compete with both Apple's ecosystem barriers and GPT's general capabilities.
This article is from the WeChat public account "Mingxi Yewang", author: Luo Su, and published with authorization from 36Kr.




