ChatGLM is a series of open-source bilingual dialogue language models (Chinese and English) based on the General Language Model (GLM) architecture.
ChatGLM-6B
- Parameters: 6.2 billion
- Features:
  - Supports Chinese and English bilingual Q&A.
  - Uses quantization technology, requiring only 6GB of GPU memory at the INT4 quantization level (see the loading sketch after this list).
  - Trained on approximately 1T Chinese and English tokens, supplemented by supervised fine-tuning, feedback bootstrapping, and reinforcement learning from human feedback.
  - Provides smooth dialogue with a low deployment barrier.
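A minimal loading sketch using the Hugging Face transformers interface documented in the ChatGLM-6B repository; the model ID and the `quantize(4)` call follow the official README, while the sample query is illustrative:

```python
from transformers import AutoModel, AutoTokenizer

# trust_remote_code is required because ChatGLM ships custom model code.
tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)

# INT4 quantization keeps GPU memory use around 6GB.
model = model.quantize(4).half().cuda().eval()

# Multi-turn chat: `history` carries earlier (query, response) pairs.
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
```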
ChatGLM2-6B
- Parameters: 6.2 billion
- Features:
  - Builds on the experience of the first-generation model, with a comprehensively upgraded base model.
  - Pretrained on 1.4T Chinese and English tokens using GLM's hybrid objective function, followed by human preference alignment training.
  - Significant performance improvements across multiple datasets (e.g., MMLU, C-Eval, GSM8K, BBH).
  - Context length extended to 32K, allowing more rounds of dialogue.
  - Inference speed increased by 42%; under INT4 quantization, the dialogue length supported by 6GB of GPU memory increased from 1K to 8K tokens.
  - Model weights are fully open for academic research, and free commercial use is permitted.
ChatGLM3-6B
- Parameters: 6.2 billion
- Features:
  - Jointly released by Zhipu AI and Tsinghua University's Knowledge Engineering Group (KEG).
  - Retains the conversational fluency and low deployment barrier of the previous two generations.
  - Introduces a newly designed prompt format and new capabilities such as tool invocation, further enhancing dialogue performance and user experience.
GLM-4 Series
- Models Include: GLM-4, GLM-4-Air, GLM-4-9B, etc.
- Features:
  - Pretrained on 10 trillion tokens, primarily bilingual (Chinese and English), supplemented by small-scale corpora in 24 other languages.
  - Achieves high-quality alignment through multi-stage post-training, including supervised fine-tuning and learning from human feedback.
  - Performs strongly across multiple benchmarks, such as MMLU, GSM8K, MATH, BBH, GPQA, and HumanEval, rivaling or surpassing GPT-4 on several of them.
  - The GLM-4 All Tools model is aligned to understand user intent and autonomously decide which tools to use (e.g., web browser, Python interpreter, text-to-image model) to complete complex tasks (see the API sketch after this list).
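A hedged sketch of calling GLM-4 with function-style tools through the zhipuai Python SDK's OpenAI-style interface; the `get_flight_price` tool, its schema, and the API key placeholder are hypothetical illustrations, not part of the official API:

```python
from zhipuai import ZhipuAI

client = ZhipuAI(api_key="your-api-key")  # placeholder; use your own key

# Hypothetical tool definition in the OpenAI-style function-calling schema.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_flight_price",
            "description": "Query ticket prices for a given route",
            "parameters": {
                "type": "object",
                "properties": {
                    "departure": {"type": "string", "description": "Departure city"},
                    "destination": {"type": "string", "description": "Destination city"},
                },
                "required": ["departure", "destination"],
            },
        },
    }
]

# The model decides on its own whether to answer directly or emit a tool call.
response = client.chat.completions.create(
    model="glm-4",
    messages=[{"role": "user", "content": "帮我查一下北京到上海的机票价格"}],
    tools=tools,
)
print(response.choices[0].message)
```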
Applications
Smart Customer Service
ChatGLM can be used in enterprise intelligent customer service systems to improve efficiency and reduce labor costs while enhancing customer satisfaction. Specific applications include (a multi-turn sketch follows this list):
- Customer Inquiries: Responds in real-time to common customer questions, providing product information and usage guidance.
- Complaint Handling: Automatically records and categorizes complaints, providing initial solutions or transferring to human customer service.
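Continuing with the `model` and `tokenizer` loaded in the earlier sketch, a minimal multi-turn customer-service flow; the FAQ wording is hypothetical:

```python
# Reuses the `model` and `tokenizer` from the loading sketch above.
history = []

# Turn 1: a product question (wording is illustrative only).
response, history = model.chat(
    tokenizer, "你们的产品支持七天无理由退货吗?", history=history)
print("Bot:", response)

# Turn 2: a follow-up; `history` lets the model resolve the reference from context.
response, history = model.chat(
    tokenizer, "那退货运费由谁承担?", history=history)
print("Bot:", response)
```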
Education
In the education sector, ChatGLM can be used to create interactive learning tools, answer student questions, and offer study suggestions and resource recommendations. Specific applications include:
- Online Tutoring: Provides personalized tutoring to students, answering academic questions.
- Content Generation: Automatically generates teaching materials, exercises, and answer explanations.
Healthcare
In healthcare, ChatGLM can assist doctors and patients with initial health consultations and diagnostic advice, improving the efficiency and quality of medical services. Specific applications include:
- Health Consultations: Answers common health questions from patients and provides prevention and wellness suggestions.
- Condition Tracking: Helps doctors record and analyze patient condition changes, providing personalized treatment suggestions.
Finance
In the financial sector, ChatGLM can be applied to customer service, risk assessment, and market analysis, enhancing the intelligence level of financial services. Specific applications include:
- Investment Advice: Provides real-time market analysis and investment suggestions.
- Risk Management: Automatically identifies and evaluates potential financial risks, offering warnings and countermeasures.
Content Generation
ChatGLM can generate various types of text content, such as news reports, blog posts, and product descriptions, helping content creators improve efficiency. Specific applications include:
- News Generation: Automatically generates news reports based on input keywords or events.
- Marketing Copy: Creates engaging product descriptions and advertisement copy.
Language Translation
ChatGLM supports bilingual Chinese-English processing and can be used for language translation and cross-language communication, helping users overcome language barriers. Specific applications include:
- Real-Time Translation: Provides instant text translation services, primarily between Chinese and English.
- Cross-Language Dialogue: Facilitates real-time conversations across different languages.
Programming Assistance
ChatGLM can also assist with programming by helping developers generate code, debug programs, and solve coding problems. Specific applications include (a short prompt sketch follows this list):
- Code Generation: Generates code snippets based on descriptions.
- Error Troubleshooting: Analyzes code errors and provides suggestions for fixes.
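Again reusing the locally loaded model from the first sketch, a hedged example of prompting for a code snippet; the prompt wording is illustrative:

```python
# Ask the model to generate a small, self-contained function.
prompt = "用 Python 写一个函数,判断一个字符串是否为回文,并给出简单的测试。"
response, _ = model.chat(tokenizer, prompt, history=[])
print(response)  # generated code should still be reviewed and tested by a developer
```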
Open-Source Versions
ChatGLM-6B
- Parameters: 6.2 billion
- Features:
  - Bilingual Support: ChatGLM-6B was trained on 1T tokens of Chinese and English corpora at a 1:1 ratio, enabling bilingual capabilities.
  - Low Deployment Barrier: ChatGLM-6B can run on just 6GB of GPU memory using INT4 quantization, making it easy for individual users and small businesses to deploy.
  - Efficient Fine-Tuning: Provides an efficient parameter fine-tuning method based on P-Tuning v2, allowing developers to customize the model for specific use cases (see the checkpoint-loading sketch after this list).
  - Open-Source License: The model weights are fully open for academic research, and free commercial use is permitted, promoting wide adoption and development of the technology.
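A sketch of loading a P-Tuning v2 checkpoint on top of the frozen base model, following the ptuning example in the ChatGLM-6B repository; the checkpoint path and `pre_seq_len=128` are assumptions that must match your own training run:

```python
import os
import torch
from transformers import AutoConfig, AutoModel, AutoTokenizer

# Hypothetical path to a checkpoint produced by the repo's ptuning scripts.
CHECKPOINT_PATH = "output/checkpoint-3000"

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
# pre_seq_len must match the value used during P-Tuning v2 training.
config = AutoConfig.from_pretrained(
    "THUDM/chatglm-6b", trust_remote_code=True, pre_seq_len=128)
model = AutoModel.from_pretrained(
    "THUDM/chatglm-6b", config=config, trust_remote_code=True)

# Only the learned prefix-encoder weights are loaded; the base model stays frozen.
prefix_state_dict = torch.load(os.path.join(CHECKPOINT_PATH, "pytorch_model.bin"))
new_prefix_state_dict = {
    k[len("transformer.prefix_encoder."):]: v
    for k, v in prefix_state_dict.items()
    if k.startswith("transformer.prefix_encoder.")
}
model.transformer.prefix_encoder.load_state_dict(new_prefix_state_dict)

model = model.half().cuda().eval()
```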
ChatGLM2-6B
- Parameters: 6.2 billion
- Features:
  - Performance Enhancement: ChatGLM2-6B builds on the first-generation model, using GLM's hybrid objective function with 1.4T Chinese and English tokens of pretraining and human preference alignment training. It shows significant performance improvements across multiple datasets.
  - Extended Context Length: With FlashAttention, the base model's context length is extended from 2K (ChatGLM-6B) to 32K, with an 8K context length used during dialogue training, allowing for longer conversations.
  - More Efficient Inference: With Multi-Query Attention, inference speed is increased by 42%, and under INT4 quantization the dialogue length supported by 6GB of GPU memory increases from 1K to 8K tokens (see the streaming sketch after this list).
  - Open License: ChatGLM2-6B's weights are fully open for academic research, and free commercial use is permitted.
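A hedged sketch of token-by-token streaming via the `stream_chat` method exposed by the ChatGLM2-6B model code; the model ID follows the official repo, and the query wording is illustrative:

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
model = model.half().cuda().eval()

# stream_chat yields the cumulative response after each generation step,
# which enables typewriter-style output in a UI.
printed = 0
for response, history in model.stream_chat(tokenizer, "什么是多查询注意力?", history=[]):
    print(response[printed:], end="", flush=True)  # print only the new part
    printed = len(response)
print()
```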
ChatGLM3-6B
- Parameters: 6.2 billion
- Features:
  - Base Model Upgrades: ChatGLM3-6B uses more diverse training data, more training steps, and an optimized training strategy, excelling on datasets covering semantics, mathematics, reasoning, code, and knowledge.
  - Functionality Support: ChatGLM3-6B introduces a newly designed prompt format, natively supporting tool invocation (Function Call), code execution (Code Interpreter), and agent tasks for complex scenarios (see the sketch after this list).
  - Complete Open-Source Lineup: In addition to the dialogue model ChatGLM3-6B, the base model ChatGLM3-6B-Base and the long-text dialogue model ChatGLM3-6B-32K are also open-sourced.
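A hedged sketch of the tool-invocation pattern described in the ChatGLM3 repository, where tools are registered through a system message; the `get_weather` tool and its schema are hypothetical, and the exact prompt format may differ across repo versions:

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm3-6b", trust_remote_code=True)
model = model.half().cuda().eval()

# Hypothetical tool definition registered via a system message.
tools = [
    {
        "name": "get_weather",
        "description": "Query the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    }
]
system_info = {
    "role": "system",
    "content": "Answer the following questions as best as you can. "
               "You have access to the following tools:",
    "tools": tools,
}

# When a tool is needed, the model replies with a structured call
# (tool name plus parameters) instead of plain text.
response, history = model.chat(tokenizer, "北京今天天气怎么样?", history=[system_info])
print(response)
```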