Six Major Domestic Reasoning Models Battle OpenAI

04/25/2025

"DeepSeek-R1 is akin to the Soviet Union's first satellite, marking the Sputnik moment for AI, heralding a new era."

Just before the 2025 Spring Festival, DeepSeek burst into the world's sky amid the New Year's Eve fireworks.

Just hours before the New Year's Eve dinner, an engineer from a domestic cloud server company was abruptly pulled into a workgroup, tasked with urgently optimizing chips to adapt to the latest DeepSeek-R1 model. The engineer shared, "From initiation to completion, the entire process spanned less than a week."

On the second day of the Lunar New Year, the phone of the head of a company specializing in B-end Agent business rang incessantly. Customers' demands were straightforward: verify the model's real performance promptly and expedite deployment.

Before the holiday, a crowd of large models coexisted; after it, DeepSeek reigned supreme. DeepSeek-R1 is a watershed that rewrote the narrative logic of China's large models.

Starting in November 2022, when OpenAI released ChatGPT, built on GPT-3.5, China set out on the path of catching up with OpenAI. In 2023, large models sprang up like mushrooms after rain, to the point that AI was unthinkable without them. Vendors chased one another hotly, and the "battle of a hundred models" took shape.

In succession, the protagonists of 2024 became the "AI Six Tigers," and AI entrepreneurship became the new storyline. In just one year, Zhipu AI completed cumulative financing of 4 billion yuan, and Moonshot AI's total financing exceeded 1.3 billion US dollars. Capital's embrace turned them into star unicorns under the spotlight.

A new turning point came after DeepSeek-R1 exploded onto the scene. For a time the industry found itself "half flame, half seawater," actively embracing R1 while deeply introspecting.

The hesitation was fleeting. As Baidu, Alibaba, ByteDance, Tencent, iFLYTEK, and other vendors successively released their latest reasoning models, the theme of AI narrative in 2025 emerged: "Six reasoning models confront OpenAI."

Prime Time for Reasoning Models

Reviewing OpenAI's release timeline, its base models can be divided into the GPT series and the o series. The o1 model, released in 2024, marked a milestone turning point.

(Photon Planet Chart)

The GPT series is OpenAI's earliest model family, focused on natural language processing, dialogue systems, and text generation, with an emphasis on language fluency and contextual understanding. The o series, the newer family that OpenAI began in 2024 with o1, concentrates on "structured reasoning," emphasizing logic, analysis, and tool invocation; it supplements and extends the GPT series' language-first approach.

In the future, the GPT series may gradually fade into history. OpenAI announced in its update log that, starting April 30, 2025, GPT-4 will be retired from ChatGPT and fully replaced by GPT-4o.

Had it been merely one of OpenAI's technology choices, the o series and DeepSeek-R1 would not have had such a profound impact. Take underlying model architecture as an example: some companies opt for the standard Transformer architecture, while others choose self-developed ones.

The rise of the o series has a broader context, namely a paradigm shift in large models: from the Scaling Law of model parameters in the traditional pre-training phase to a new Scaling Law driven by reinforcement-learning reasoning compute. This was verified during the development of OpenAI's o3. OpenAI observed that large-scale reinforcement learning exhibits the same trend as GPT-series pre-training: the more compute, the better the performance.

In essence, it lets AI plan, learn, give feedback, and complete tasks autonomously, which are exactly the abilities required by today's popular Agents.

A technician told Photon Planet that the "Deep Research" Agent released after o1 is trained end-to-end on top of the base model and does not expose its chain-of-thought reasoning. "This means the base model's capability directly determines how well the Agent lands." To stay competitive in the second phase of large models, a reasoning model has become all but a necessity.

From the perspective of company and technology leaders, promptly following up on o1 and DeepSeek-R1 is a testament to judgment and vision but also signifies significant investment and high risk.

As we understand it, many domestic companies nominally possess self-developed large models but are in fact "shells" wrapping other models. The o series emerged standing on the shoulders of GPT, deterring companies with shaky foundations, and the pressure of financing and commercialization eliminated another batch.

(Photon Planet Chart)

Thus we find that the major players, once out of the spotlight, became the fastest and most timely followers.

Taking DeepSeek-R1 (released on January 20, 2025) as the baseline: iFLYTEK released its deep reasoning model, iFLYTEK Spark X1, that same month; in March, Baidu released ERNIE X1, Alibaba released the Tongyi Qianwen QwQ-32B reasoning model, and Tencent released the Hunyuan T1 deep thinking model; in April, ByteDance's Doubao 1.5 deep thinking model launched, and iFLYTEK simultaneously upgraded Spark X1, releasing its "unified model of fast and slow thinking."

These vendors share commonalities: they have kept pace with every model capability upgrade, and before pivoting to reasoning, their base models had essentially reached GPT-4 level. By that reference, this may be the prerequisite for entering the second phase of large models.

Six Major Reasoning Models Scramble with o3

o3 is currently OpenAI's most powerful reasoning model. An online large model IQ chart indicates that the average human IQ is 100, while o3's IQ soars to a staggering 136.

Test data reveals that o3 surpasses o1's performance in multiple benchmark tests, particularly in visual tasks like analyzing images, charts, and graphs.

In external expert assessments, o3 makes 20% fewer significant errors than o1 in challenging real-world tasks and excels in areas such as programming, business, consulting, and creative conception.

It must be acknowledged that OpenAI still has real tricks up its sleeve: after o1, o3 set a new peak in large model performance. But the domestic vendors have not been slow to follow. Taking DeepSeek-R1 as the reference standard, the reasoning models released by Baidu, Alibaba, iFLYTEK, ByteDance, and Tencent are not far off its level, and some even surpass it on certain test metrics.

As of now, the six domestic reasoning models each boast their own strengths.

The significance of DeepSeek-R1 is self-evident. Its comprehensive technical report and open-source release gave the industry a training approach for large reasoning models: it opened the "black box" of OpenAI's closed source and successfully replicated o1 with comparable performance. R1's hallmark is "doing big things with little money," pursuing efficiency and extreme cost-effectiveness. With very limited investment in computing power and data, its reported training cost was only 5.6 million US dollars, far below the tens or even hundreds of millions of US dollars invested by American AI companies.

An insider told us that DeepSeek-R1 and some domestic reasoning models are not direct competitors. In B-end business, Alibaba's currently open-sourced Qianwen series accounts for a larger share. "Full sizes and full variants, like a family bucket, let customers pick what they need. The cost of running a 32B model is not very high either."
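On why a 32B model is "not very high" cost to run, a back-of-envelope sketch (our illustration, not from the article) shows how weight memory scales with serving precision. The figures count model weights only, ignoring KV cache and activations:

```python
def weight_memory_gb(n_params_billion: float, bytes_per_param: float) -> float:
    """Approximate memory (GB) needed just to hold the model weights."""
    return n_params_billion * 1e9 * bytes_per_param / 1024**3

# A 32B-parameter model under common serving precisions (illustrative):
for label, bpp in [("FP16", 2.0), ("INT8", 1.0), ("INT4", 0.5)]:
    print(f"{label}: ~{weight_memory_gb(32, bpp):.0f} GB")
```

At INT4 quantization, the weights of a 32B model fit in roughly 15 GB, which is one reason models of this size can be served on a single high-memory accelerator rather than a large cluster.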

In this wave, Baidu integrated DeepSeek at the ecosystem level, giving users more choices; its open-source and free strategy may attract more of them. ERNIE X1 adopts "chain-of-thought plus action-chain" collaborative training, automatically breaking complex tasks into more than 20 reasoning steps and invoking over a dozen tool chains to strengthen its Agent capabilities.

Someone who has participated in cooperation with Baidu told Photon Planet that in vertical fields such as finance, healthcare, and government affairs, Baidu will "play matchmaker" and bring together companies with related businesses. "Baidu provides the base model, we provide the technology needed by the other party, and finally, we directly settle accounts with Baidu." In this manner, Baidu is continuously narrowing the gap with iFLYTEK in the To B large model market.

iFLYTEK's Spark X1 is currently the only deep reasoning large model trained entirely on domestic computing power.

Leveraging the advantages of full-stack localization and autonomy, iFLYTEK's Spark large model is favored by central and state-owned enterprises and government customers, maintaining a leading position in the industry. On April 21, the upgrade of Spark X1 enhanced its general capabilities and simultaneously strengthened its industry-oriented solution capabilities. In tests in key industries such as education, healthcare, and justice, it obtained scores surpassing those of OpenAI and DeepSeek, and these capabilities will undoubtedly be reflected in large model orders this year.

Spark X1 supports two thinking modes at once, improving its handling of tasks of varying complexity. The full-strength version of Spark X1 can be deployed on just four cards (Huawei 910B). Deep cooperation with Huawei, a continuously iterated base model, and a strong industry-deployment system have become iFLYTEK's three weapons for standing out amid the siege of the major players.

Among domestic closed-source large models, Doubao is seen as "having a certain price competitiveness." A maker of AI toys told us his products connect to multiple large models and, in use, burn through each vendor's free token quota first. "Once that's exceeded, we switch to Doubao first, which keeps the price relatively low."

Last year, Doubao helped lead the price war, cutting its large model to 0.0008 yuan per 1,000 tokens and pricing its visual understanding model at 0.003 yuan per 1,000 tokens, both below the industry average at the time. Doubao is also a case worth studying for landing technology in AI applications: end-to-end real-time speech, multimodality, and Agent technologies all reach Doubao's application side promptly, one reason behind its rapid iteration.
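To put the quoted prices in perspective, a small worked example (ours, with an assumed usage volume) converts a per-1,000-token price into a monthly bill:

```python
def monthly_token_cost(tokens_per_day: float, yuan_per_1k: float, days: int = 30) -> float:
    """Monthly cost in yuan at a given price per 1,000 tokens."""
    return tokens_per_day * days / 1000 * yuan_per_1k

# At the article's quoted 0.0008 yuan / 1,000 tokens for Doubao's text model,
# a hypothetical 10 million tokens per day comes to roughly 240 yuan a month:
print(f"{monthly_token_cost(10_000_000, 0.0008):.2f} yuan/month")
```

The arithmetic illustrates why B-end buyers describe such pricing as competitive: even heavy daily volumes translate into a modest monthly spend.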

Tencent's Hunyuan entered the market later. An employee once told us that most of the Hunyuan team came from search, recommendation, and advertising, leaving a certain gap versus Tongyi and ByteDance. "They were forced into it, and there seemed to be no clear direction, going here and there," "a group of laymen directing insiders." Coupled with staff departures, Hunyuan was for a time stagnant.

Riding DeepSeek's rising tide, Hunyuan has quietly staged a comeback, at least judging by the data. An insider told us that in the months since the 2025 Spring Festival, Tencent has poured group-wide resources into promoting Hunyuan: offline event resources, WeChat traffic diversion, and budget have all tilted heavily toward it. Through this all-in approach, a previously passive situation has been reversed.

Judging from current market feedback, cloud-based multimodal invocation has gradually gained recognition; the coexistence of many models, invoked on demand, is the future. In reality, whether a customer ultimately chooses a large model depends on more than performance: data and ecosystem considerations also weigh in.

Will Large Models Be Fully Localized?

Since DeepSeek-R1, domestic reasoning large models have become regular guests on various rankings, with users in the AI open-source community supporting the development of Chinese AI through real downloads and Star counts.

Even so, current large models still face more or less "bottleneck" issues.

Recently it was reported that NVIDIA has informally notified its AIC partners (such as Colorful, GALAXY, Tongde, etc.) to suspend sales and shipments of the GeForce RTX 5090D. The move is considered a precaution by NVIDIA against changes in the international environment.

Although NVIDIA has not yet issued an official announcement, the industry generally believes that the supply of RTX 5090D has entered a "suspended state," and this is just the beginning.

If supply is restricted at the source, NVIDIA will inevitably suffer greater losses, and large model development outside the US will face uncertainty, slowing the pace of catching up with OpenAI.

Against this backdrop, the full-stack localization technology path will increasingly become an option for everyone. Among them, iFLYTEK has made relatively sufficient preparations. It is understood that iFLYTEK and its partners have jointly optimized four core technologies to double the reasoning performance of the MoE model cluster.

According to the latest test-set evaluations, Spark X1 matches OpenAI o1 and DeepSeek R1 in general task performance, with outstanding results in mathematics and knowledge Q&A, showing that on the path of technological autonomy and controllability, Chinese AI already has the strength to compete with top international models on the same stage.

Last year's dazzling AI Six Tigers have now diverged, with sharply contrasting fortunes. Moonshot AI, whose "AGI ideal," "genius-founder story," and "star AI product" were dismantled by DeepSeek, has retreated into low-profile research. MiniMax, after separating core technology from products, has doubled down on technology, focusing on Agents and reasoning models much as before. Among the Six Tigers, Zhipu AI finally received the promising news of an impending IPO, though its revenue, valuation, and the odds of a successful listing remain uncertain.

Last year, AI applications like Kimi and Hailuo AI briefly put AI startups in the limelight. This year, however, reasoning models have become a key battleground for the domestic heavyweights, and the Six Tigers' focus overlaps heavily with theirs. The "lifeblood" that determines their survival now lies firmly in the giants' hands.

In the current landscape, with six prominent reasoning models ascendant and uncertainty in the international environment escalating, fully localized large models are poised to become the new norm.

From semiconductors, industrial software, and information innovation to today's AI chips, historical precedent underscores that independence and autonomy are the way out of current restrictions. It is conceivable that in the near future, more domestic large models will take the fully localized path, confronting giants like OpenAI head-on.
