
Power of Chinese AI Models



Introduction

After the market turmoil triggered by DeepSeek R1, attention has shifted towards China. The West is now looking towards the East, and even those in the East are turning their gaze northward.

I have been tracking these models for some time, so I thought I would summarize them in one place for my readers.

Open source: 🚀

Partially or fully closed source: 🔒

List of Chinese Models

| Developer | Model Series | Models | Features |
|---|---|---|---|
| Tsinghua & Fudan University | OpenChineseGPT | OpenChineseGPT 🚀 | Dialogue, instruction-following |
| Tsinghua & Fudan University | OpenBuddy | OpenBuddy 🚀 | Dialogue, instruction-following |
| Tsinghua & Fudan University | OpenChineseLLaMA | OpenChineseLLaMA 🚀 | Dialogue, instruction-following |
| Shanghai AI Lab | Fengshenbang Series | Fengshenbang-13B 🚀, Fengshenbang-7B 🚀 | General-purpose, multilingual |
| IDEA Research | Ziya Series | Ziya-LLaMA 🚀, Ziya-13B 🚀 | Dialogue, instruction-following |
| Tsinghua University | CPM Series | CPM-1 🚀, CPM-2 🚀, CPM-3 🚀 | Early Chinese LLMs |
| Huawei | PanGu | PanGu 🔒 | Large-scale, multilingual |
| Tsinghua & Fudan University | Chinese LLaMA & Alpaca | Chinese LLaMA 🚀, Chinese Alpaca 🚀 | Dialogue, instruction-following |
| Fudan University | MOSS | MOSS 🚀 | Dialogue, general-purpose |
| Zhipu AI | ChatGLM Series | ChatGLM3 🚀, ChatGLM2 🚀, ChatGLM 🚀, GLM-4 🚀 | Chinese dialogue, multi-turn, long-context |
| Alibaba Cloud | Qwen Series | Qwen-1.8B 🚀, Qwen-7B 🚀, Qwen-14B 🚀, Qwen-72B 🚀, Qwen-2.5-1M 🚀 | Multimodal, multilingual, 32K tokens, strong performance on benchmarks |
| Baichuan Intelligent Tech | Baichuan Series | Baichuan-7B 🚀, Baichuan-13B 🚀, Baichuan2 🚀 | High performance, quantized versions |
| Shanghai AI Lab | InternLM Series | InternLM 🚀, InternLM-Chat 🚀 | General-purpose, long-context |
| 01.AI | Yi Series | Yi-1.0 🚀, Yi-6B 🚀, Yi-34B 🚀 | Multilingual, long-context |
| DeepSeek AI | DeepSeek Series | DeepSeek-V2 🚀, DeepSeek-LLM-67B 🚀, DeepSeek-R1 🚀 | High performance, Chinese & English, advanced reasoning for math and coding |
| Shenzhen Yuanxiang AI | XVERSE Series | XVERSE-7B 🚀, XVERSE-13B 🚀, XVERSE-65B 🚀 | Multilingual, 256K tokens |
| Peking University | YuLan Series | YuLan-Base-126B 🚀, YuLan-Chat-3-126B 🚀 | Multilingual, large-scale pretraining |
| Sichuan AI University | gLAW | LAW 🚀, LAWMiner 🚀, LLAMA 🚀, Fuzz 🚀, Mingcha 🚀 | Specialized for legal tasks |
| Baidu | ERNIE | ERNIE 3.0 Titan 🔒 | Knowledge-enhanced, 260 billion parameters, supports multiple industries |
| ByteDance | Doubao | Doubao 1.5 Pro 🔒 | Better than ChatGPT-4o in knowledge retention, coding, and reasoning; optimized for lower hardware costs |
| Tencent | Hunyuan | Hunyuan 🔒 | Supports image and text generation and logical reasoning; aimed at enterprise use |
| Moonshot AI | Kimi | Kimi k1.5 🔒 | Matches or outperforms OpenAI o1; focused on solving complex problems |
| SenseTime | SenseNova | SenseNova 🔒 | Includes models for natural language processing, content generation, and data annotation |
| MiniMax | MiniMax-Text | MiniMax-Text-01 🔒 | Large parameter size (456 billion), outperforms on some benchmarks, large context window |
| Kuaishou | Kling | Kling 🔒 | Text-to-video model, free to the public, simulates real-world motion and physics |
| iFlytek | iFlytek Spark | iFlytek Spark V4.0 🔒 | Improved core capabilities; ranks high in international tests compared to GPT-4 Turbo |
