Power of Chinese AI Models
Introduction
After the DeepSeek R1 bloodbath in the market, people have started paying much more attention to China: the West is looking toward the East, and even people in the East are looking toward the North.
I have been tracking these models for some time, so I thought I would summarize them in one place for my readers.
Open source: 🚀
Partially or fully closed source: 🔒
List of Chinese Models
Developer | Model Series | Models Included | Features |
---|---|---|---|
Tsinghua & Fudan University | OpenChineseGPT | OpenChineseGPT 🚀 | Dialogue, instruction-following |
Tsinghua & Fudan University | OpenBuddy | OpenBuddy 🚀 | Dialogue, instruction-following |
Tsinghua & Fudan University | OpenChineseLLaMA | OpenChineseLLaMA 🚀 | Dialogue, instruction-following |
Shanghai AI Lab | Fengshenbang Series | Fengshenbang-13B 🚀, Fengshenbang-7B 🚀 | General-purpose, multilingual |
IDEA Research | Ziya Series | Ziya-LLaMA 🚀, Ziya-13B 🚀 | Dialogue, instruction-following |
Tsinghua University | CPM Series | CPM-1 🚀, CPM-2 🚀, CPM-3 🚀 | Early Chinese LLMs |
Huawei | PanGu | PanGu 🔒 | Large-scale, multilingual |
Tsinghua & Fudan University | Chinese LLaMA & Alpaca | Chinese LLaMA 🚀, Chinese Alpaca 🚀 | Dialogue, instruction-following |
Fudan University | MOSS | MOSS 🚀 | Dialogue, general-purpose |
Zhipu AI | ChatGLM Series | ChatGLM3 🚀, ChatGLM2 🚀, ChatGLM 🚀, GLM-4 🚀 | Chinese dialogue, multi-turn, long-context |
Alibaba Cloud | Qwen Series | Qwen-1.8B 🚀, Qwen-7B 🚀, Qwen-14B 🚀, Qwen-72B 🚀, Qwen-2.5-1M 🚀 | Multimodal, multilingual, 32K tokens, strong performance on benchmarks |
Baichuan Intelligent Tech | Baichuan Series | Baichuan-7B 🚀, Baichuan-13B 🚀, Baichuan2 🚀 | High performance, quantized versions |
Shanghai AI Lab | InternLM Series | InternLM 🚀, InternLM-Chat 🚀 | General-purpose, long-context |
01.AI | Yi Series | Yi-1.0 🚀, Yi-6B 🚀, Yi-34B 🚀 | Multilingual, long-context |
DeepSeek AI | DeepSeek Series | DeepSeek-V2 🚀, DeepSeek-LLM-67B 🚀, DeepSeek-R1 🚀 | High performance, Chinese & English, advanced reasoning for math and coding |
Shenzhen Yuanxiang AI | XVERSE Series | XVERSE-7B 🚀, XVERSE-13B 🚀, XVERSE-65B 🚀 | Multilingual, 256K tokens |
Peking University | YuLan Series | YuLan-Base-126B 🚀, YuLan-Chat-3-126B 🚀 | Multilingual, large-scale pretraining |
Sichuan AI University | gLAW | LAW 🚀, LAWMiner 🚀, LLAMA 🚀, Fuzz 🚀, Mingcha 🚀 | Specialized for legal tasks |
Baidu | ERNIE | ERNIE 3.0 Titan 🔒 | Knowledge-enhanced, 260 billion parameters, supports multiple industries |
ByteDance | Doubao | Doubao 1.5 Pro 🔒 | Claimed to outperform GPT-4o in knowledge retention, coding, and reasoning; optimized for lower hardware costs |
Tencent | Hunyuan | Hunyuan 🔒 | Supports image and text generation, logical reasoning, aimed at enterprise use |
Moonshot AI | Kimi | Kimi k1.5 🔒 | Claimed to match or outperform OpenAI o1; focused on solving complex problems |
SenseTime | SenseNova | SenseNova 🔒 | Includes models for natural language processing, content generation, data annotation |
MiniMax | MiniMax-Text | MiniMax-Text-01 🔒 | Large parameter size (456 billion), outperforms on some benchmarks, large context window |
Kuaishou | Kling | Kling 🔒 | Text-to-video model, free to public, simulates real-world motion and physics |
iFlytek | iFlytek Spark | iFlytek Spark V4.0 🔒 | Improved core capabilities; claimed to rank well against GPT-4 Turbo in international benchmarks |
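
Most of the open-source (🚀) entries above publish their weights on Hugging Face, so you can try them locally. Below is a minimal sketch using the transformers library; the repo id `Qwen/Qwen-7B-Chat` is an assumption based on the Qwen-7B row in the table, so check the actual model card for the exact name, license, hardware requirements, and whether `trust_remote_code` is needed for that release.

```python
# Minimal sketch: load one of the open-source Chinese models via Hugging Face
# transformers. "Qwen/Qwen-7B-Chat" is an assumed repo id -- verify on the Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen-7B-Chat"  # assumed Hub repo id; swap in the model you want

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",        # spread weights across available GPUs/CPU
    trust_remote_code=True,   # some Chinese model repos ship custom model code
)

prompt = "Summarize the strengths of open-source Chinese LLMs in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The closed-source (🔒) models are generally only reachable through their vendors' hosted apps or APIs rather than downloadable weights.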