
Power of Chinese AI Models#
Introduction#
After the DeepSeek R1 turmoil in the markets, there has been a shift in attention towards China. The West is now looking East, and even those in the East are turning their gaze northward.
I have been tracking these models for some time, so I thought I would summarize them in one place for my readers.
Open source: π
Partially or fully closed source: π
List of Chinese Models#
| Developer | Series | Models | Features |
|---|---|---|---|
| Tsinghua & Fudan University | OpenChineseGPT | OpenChineseGPT π | Dialogue, instruction-following |
| Tsinghua & Fudan University | OpenBuddy | OpenBuddy π | Dialogue, instruction-following |
| Tsinghua & Fudan University | OpenChineseLLaMA | OpenChineseLLaMA π | Dialogue, instruction-following |
| IDEA Research | Fengshenbang Series | Fengshenbang-13B π, Fengshenbang-7B π | General-purpose, multilingual |
| IDEA Research | Ziya Series | Ziya-LLaMA π, Ziya-13B π | Dialogue, instruction-following |
| Tsinghua University | CPM Series | CPM-1 π, CPM-2 π, CPM-3 π | Early Chinese LLMs |
| Huawei | PanGu | PanGu π | Large-scale, multilingual |
| Tsinghua & Fudan University | Chinese LLaMA & Alpaca | Chinese LLaMA π, Chinese Alpaca π | Dialogue, instruction-following |
| Fudan University | MOSS | MOSS π | Dialogue, general-purpose |
| Zhipu AI | ChatGLM Series | ChatGLM3 π, ChatGLM2 π, ChatGLM π, GLM-4 π | Chinese dialogue, multi-turn, long-context |
| Alibaba Cloud | Qwen Series | Qwen-1.8B π, Qwen-7B π, Qwen-14B π, Qwen-72B π, Qwen-2.5-1M π | Multimodal variants, multilingual, long context (up to 1M tokens in Qwen-2.5-1M), strong benchmark performance |
| Baichuan Intelligent Tech | Baichuan Series | Baichuan-7B π, Baichuan-13B π, Baichuan2 π | High performance, quantized versions |
| Shanghai AI Lab | InternLM Series | InternLM π, InternLM-Chat π | General-purpose, long-context |
| 01.AI | Yi Series | Yi-1.0 π, Yi-6B π, Yi-34B π | Multilingual, long-context |
| DeepSeek AI | DeepSeek Series | DeepSeek-V2 π, DeepSeek-LLM-67B π, DeepSeek-R1 π | High performance, Chinese & English, advanced reasoning for math and coding |
| Shenzhen Yuanxiang AI | XVERSE Series | XVERSE-7B π, XVERSE-13B π, XVERSE-65B π | Multilingual, 256K tokens |
| Renmin University of China | YuLan Series | YuLan-Base-126B π, YuLan-Chat-3-126B π | Multilingual, large-scale pretraining |
| Sichuan AI University | Legal Series | LAW π, LAWMiner π, Lawyer-LLaMA π, Fuzi-Mingcha π | Specialized for legal tasks |
| Baidu | ERNIE | ERNIE 3.0 Titan π | Knowledge enhanced with 260 billion parameters, supports multiple industries |
| ByteDance | Doubao | Doubao 1.5 Pro π | Claimed to outperform GPT-4o in knowledge retention, coding, and reasoning; optimized for lower hardware costs |
| Tencent | Hunyuan | Hunyuan π | Supports image and text generation, logical reasoning, aimed at enterprise use |
| Moonshot AI | Kimi | Kimi k1.5 π | Reported to match or outperform OpenAI o1 on some benchmarks; focused on solving complex problems |
| SenseTime | SenseNova | SenseNova π | Includes models for natural language processing, content generation, data annotation |
| MiniMax | MiniMax-Text | MiniMax-Text-01 π | Large parameter count (456 billion), large context window, outperforms peers on some benchmarks |
| Kuaishou | Kling | Kling π | Text-to-video model, free to public, simulates real-world motion and physics |
| iFlytek | iFlytek Spark | iFlytek Spark V4.0 π | Improved core capabilities; reported to rank highly in international benchmarks against GPT-4 Turbo |
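Since the open/closed-source markers above can be hard to scan, here is a minimal sketch of how you might keep this catalog in machine-readable form and filter it by openness. The `open_source` flags below are illustrative assumptions for a few series, not verified license information — several of these vendors release some weights openly and keep others closed.

```python
# Minimal sketch: a machine-readable slice of the catalog above.
# The open_source flags are illustrative assumptions, not verified licenses.
MODELS = [
    {"developer": "Alibaba Cloud", "series": "Qwen", "open_source": True},
    {"developer": "DeepSeek AI", "series": "DeepSeek", "open_source": True},
    {"developer": "Zhipu AI", "series": "ChatGLM", "open_source": True},
    {"developer": "ByteDance", "series": "Doubao", "open_source": False},
    {"developer": "Tencent", "series": "Hunyuan", "open_source": False},
]

def by_openness(models, open_source=True):
    """Return the series names matching the given openness flag, sorted."""
    return sorted(m["series"] for m in models if m["open_source"] == open_source)

print(by_openness(MODELS))           # open-weight series
print(by_openness(MODELS, False))    # closed or partially closed series
```

Keeping the list as plain dictionaries makes it trivial to add columns (context length, parameter count) or export back to a markdown table later.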
