Power of Chinese AI Models

Introduction
After the DeepSeek R1 turmoil in the market, there has been a shift in attention towards China. The West is now looking towards the East, and even those in the East are turning their gaze northward.

I have been tracking these models for some time, so I thought I'd summarize them in one place for my readers.
Open source: 🚀
Partially or fully closed source: 🔒
List of Chinese Models
Developer | Model Series | Models | Features |
---|---|---|---|
Tsinghua & Fudan University | OpenChineseGPT | OpenChineseGPT 🚀 | Dialogue, instruction-following |
Tsinghua & Fudan University | OpenBuddy | OpenBuddy 🚀 | Dialogue, instruction-following |
Tsinghua & Fudan University | OpenChineseLLaMA | OpenChineseLLaMA 🚀 | Dialogue, instruction-following |
Shanghai AI Lab | Fengshenbang Series | Fengshenbang-13B 🚀, Fengshenbang-7B 🚀 | General-purpose, multilingual |
IDEA Research | Ziya Series | Ziya-LLaMA 🚀, Ziya-13B 🚀 | Dialogue, instruction-following |
Tsinghua University | CPM Series | CPM-1 🚀, CPM-2 🚀, CPM-3 🚀 | Early Chinese LLMs |
Huawei | PanGu | PanGu 🔒 | Large-scale, multilingual |
Tsinghua & Fudan University | Chinese LLaMA & Alpaca | Chinese LLaMA 🚀, Chinese Alpaca 🚀 | Dialogue, instruction-following |
Fudan University | MOSS | MOSS 🚀 | Dialogue, general-purpose |
Zhipu AI | ChatGLM Series | ChatGLM3 🚀, ChatGLM2 🚀, ChatGLM 🚀, GLM-4 🚀 | Chinese dialogue, multi-turn, long-context |
Alibaba Cloud | Qwen Series | Qwen-1.8B 🚀, Qwen-7B 🚀, Qwen-14B 🚀, Qwen-72B 🚀, Qwen-2.5-1M 🚀 | Multimodal, multilingual, 32K tokens, strong performance on benchmarks |
Baichuan Intelligent Tech | Baichuan Series | Baichuan-7B 🚀, Baichuan-13B 🚀, Baichuan2 🚀 | High performance, quantized versions |
Shanghai AI Lab | InternLM Series | InternLM 🚀, InternLM-Chat 🚀 | General-purpose, long-context |
01.AI | Yi Series | Yi-1.0 🚀, Yi-6B 🚀, Yi-34B 🚀 | Multilingual, long-context |
DeepSeek AI | DeepSeek Series | DeepSeek-V2 🚀, DeepSeek-LLM-67B 🚀, DeepSeek-R1 🚀 | High performance, Chinese & English, advanced reasoning for math and coding |
Shenzhen Yuanxiang AI | XVERSE Series | XVERSE-7B 🚀, XVERSE-13B 🚀, XVERSE-65B 🚀 | Multilingual, 256K tokens |
Peking University | YuLan Series | YuLan-Base-126B 🚀, YuLan-Chat-3-126B 🚀 | Multilingual, large-scale pretraining |
Sichuan AI University | gLAW | LAW 🚀, LAWMiner 🚀, LLAMA 🚀, Fuzz 🚀, Mingcha 🚀 | Specialized for legal tasks |
Baidu | ERNIE | ERNIE 3.0 Titan 🔒 | Knowledge enhanced with 260 billion parameters, supports multiple industries |
ByteDance | Doubao | Doubao 1.5 Pro 🔒 | Claimed to outperform GPT-4o in knowledge retention, coding, and reasoning; optimized for lower hardware costs |
Tencent | Hunyuan | Hunyuan 🔒 | Supports image and text generation, logical reasoning, aimed at enterprise use |
Moonshot AI | Kimi | Kimi k1.5 🔒 | Matches or outperforms OpenAI o1, focused on solving complex problems |
SenseTime | SenseNova | SenseNova 🔒 | Includes models for natural language processing, content generation, data annotation |
MiniMax | MiniMax-Text | MiniMax-Text-01 🔒 | Large parameter size (456 billion), outperforms on some benchmarks, large context window |
Kuaishou | Kling | Kling 🔒 | Text-to-video model, free to public, simulates real-world motion and physics |
iFlytek | iFlytek Spark | iFlytek Spark V4.0 🔒 | Improved core capabilities, ranks high in international tests compared to GPT-4 Turbo |
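Many of the open-source models above (🚀) publish their weights on Hugging Face, so trying one locally is straightforward. Below is a minimal sketch using the `transformers` library; the Hugging Face model IDs in the lookup table are my assumptions based on the usual repositories for each developer, so double-check them on the Hub before running.

```python
# Minimal sketch: chatting with one of the open-source Chinese models via
# Hugging Face transformers. Model IDs below are assumptions -- verify the
# exact repository names on huggingface.co before use.

OPEN_MODELS = {
    "Qwen": "Qwen/Qwen-7B-Chat",                      # Alibaba Cloud
    "ChatGLM": "THUDM/chatglm3-6b",                   # Zhipu AI
    "DeepSeek": "deepseek-ai/deepseek-llm-7b-chat",   # DeepSeek AI
}

def chat(model_key: str, prompt: str, max_new_tokens: int = 64) -> str:
    """Load the chosen checkpoint and generate a single reply."""
    # Imported lazily: these pull in large dependencies and, on first call,
    # download multi-GB weights from the Hub.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = OPEN_MODELS[model_key]
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(chat("Qwen", "Summarize the Yangtze river in one line."))
```

Note that several of these repositories ship custom modeling code, which is why `trust_remote_code=True` is passed; review that code before enabling it in production. The closed-source models (🔒) are instead reached through each vendor's hosted API.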