
Power of Chinese AI Models

Introduction

After the DeepSeek R1 turmoil in the market, attention has shifted toward China. The West is now looking toward the East, and even those in the East are turning their gaze northward.

I have been tracking these models for some time, so I thought I would summarize them in one place for my readers.

Open source: 🚀

Partially or fully closed source: 🔒

List of Chinese Models

| Developer | Model Series | Models | Features |
| --- | --- | --- | --- |
| Tsinghua & Fudan University | OpenChineseGPT | OpenChineseGPT 🚀 | Dialogue, instruction-following |
| Tsinghua & Fudan University | OpenBuddy | OpenBuddy 🚀 | Dialogue, instruction-following |
| Tsinghua & Fudan University | OpenChineseLLaMA | OpenChineseLLaMA 🚀 | Dialogue, instruction-following |
| Shanghai AI Lab | Fengshenbang Series | Fengshenbang-13B 🚀, Fengshenbang-7B 🚀 | General-purpose, multilingual |
| IDEA Research | Ziya Series | Ziya-LLaMA 🚀, Ziya-13B 🚀 | Dialogue, instruction-following |
| Tsinghua University | CPM Series | CPM-1 🚀, CPM-2 🚀, CPM-3 🚀 | Early Chinese LLMs |
| Huawei | PanGu | PanGu 🔒 | Large-scale, multilingual |
| Tsinghua & Fudan University | Chinese LLaMA & Alpaca | Chinese LLaMA 🚀, Chinese Alpaca 🚀 | Dialogue, instruction-following |
| Fudan University | MOSS | MOSS 🚀 | Dialogue, general-purpose |
| Zhipu AI | ChatGLM Series | ChatGLM3 🚀, ChatGLM2 🚀, ChatGLM 🚀, GLM-4 🚀 | Chinese dialogue, multi-turn, long-context |
| Alibaba Cloud | Qwen Series | Qwen-1.8B 🚀, Qwen-7B 🚀, Qwen-14B 🚀, Qwen-72B 🚀, Qwen-2.5-1M 🚀 | Multimodal, multilingual, 32K tokens, strong benchmark performance |
| Baichuan Intelligent Tech | Baichuan Series | Baichuan-7B 🚀, Baichuan-13B 🚀, Baichuan2 🚀 | High performance, quantized versions |
| Shanghai AI Lab | InternLM Series | InternLM 🚀, InternLM-Chat 🚀 | General-purpose, long-context |
| 01.AI | Yi Series | Yi-1.0 🚀, Yi-6B 🚀, Yi-34B 🚀 | Multilingual, long-context |
| DeepSeek AI | DeepSeek Series | DeepSeek-V2 🚀, DeepSeek-LLM-67B 🚀, DeepSeek-R1 🚀 | High performance, Chinese & English, advanced reasoning for math and coding |
| Shenzhen Yuanxiang AI | XVERSE Series | XVERSE-7B 🚀, XVERSE-13B 🚀, XVERSE-65B 🚀 | Multilingual, 256K tokens |
| Peking University | YuLan Series | YuLan-Base-126B 🚀, YuLan-Chat-3-126B 🚀 | Multilingual, large-scale pretraining |
| Sichuan AI University | gLAW | LAW 🚀, LAWMiner 🚀, LLAMA 🚀, Fuzz 🚀, Mingcha 🚀 | Specialized for legal tasks |
| Baidu | ERNIE | ERNIE 3.0 Titan 🔒 | Knowledge-enhanced, 260 billion parameters, supports multiple industries |
| ByteDance | Doubao | Doubao 1.5 Pro 🔒 | Claims to outperform GPT-4o in knowledge retention, coding, and reasoning; optimized for lower hardware costs |
| Tencent | Hunyuan | Hunyuan 🔒 | Supports image and text generation, logical reasoning, aimed at enterprise use |
| Moonshot AI | Kimi | Kimi k1.5 🔒 | Matches or outperforms OpenAI o1, focused on solving complex problems |
| SenseTime | SenseNova | SenseNova 🔒 | Includes models for natural language processing, content generation, data annotation |
| MiniMax | MiniMax-Text | MiniMax-Text-01 🔒 | Large parameter count (456 billion), outperforms on some benchmarks, large context window |
| Kuaishou | Kling | Kling 🔒 | Text-to-video model, free to the public, simulates real-world motion and physics |
| iFlytek | iFlytek Spark | iFlytek Spark V4.0 🔒 | Improved core capabilities, ranks high in international tests compared to GPT-4 Turbo |
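Most of the open-source (🚀) entries above publish their weights on Hugging Face, so trying one locally usually takes only a few lines of `transformers` code. Below is a minimal sketch assuming the `Qwen/Qwen-7B-Chat` checkpoint and enough GPU memory; the exact model id, license terms, and whether `trust_remote_code` is required vary per repository, so check the model card before running it.

```python
# Minimal sketch: load one of the open-source Chinese LLMs via Hugging Face transformers.
# Assumes the "Qwen/Qwen-7B-Chat" checkpoint; swap in any 🚀 model id from the table,
# subject to its license and hardware requirements.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen-7B-Chat"  # assumed checkpoint name; verify on the model card

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",        # spread weights across available GPUs/CPU (needs accelerate)
    trust_remote_code=True,   # several Chinese model repos ship custom modeling code
)

prompt = "Summarize the strengths of open-source Chinese LLMs in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```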
