Qwen3 is an open-source large language model released by Alibaba, introducing a hybrid reasoning mode that allows users to choose between “thinking” or “non-thinking” modes based on task requirements.
Mixture-of-Experts (MoE) Architecture:
Qwen3 adopts an MoE architecture that activates only a subset of its parameters for each token, enhancing computational efficiency. For example, the Qwen3-235B-A22B model contains 235 billion parameters in total, but only about 22 billion are active during inference. This design lets the model maintain high performance while significantly reducing compute cost.
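The efficiency gain comes from a learned router that sends each token to only a few experts, so most expert parameters stay idle on any given forward pass. Below is a minimal, illustrative sketch of top-k expert routing in plain NumPy; the shapes, gating function, and expert count are invented for demonstration and do not reflect Qwen3's actual implementation.

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Route one token through only the top-k experts (illustrative sketch).

    x       : (d,) token hidden state
    gate_w  : (d, n_experts) router weights
    experts : list of callables, each mapping (d,) -> (d,)
    """
    logits = x @ gate_w                      # router score for every expert
    top = np.argsort(logits)[-top_k:]        # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the selected experts only
    # Only the chosen experts run, so most parameters stay idle for this token.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy usage: 8 experts, but each token activates only 2 of them.
d, n_experts = 16, 8
rng = np.random.default_rng(0)
expert_mats = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(n_experts)]
experts = [lambda x, W=W: x @ W for W in expert_mats]
gate_w = rng.standard_normal((d, n_experts))
out = moe_forward(rng.standard_normal(d), gate_w, experts)
print(out.shape)  # (16,)
```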
Multiple Model Versions:
The Qwen3 series spans multiple parameter scales and functional needs, including dense models (0.6B, 1.7B, 4B, 8B, 14B, and 32B parameters) and MoE models (Qwen3-30B-A3B and Qwen3-235B-A22B), so users can balance capability against deployment cost.
Thinking vs. Non-Thinking Modes:
Qwen3 introduces two reasoning modes to adapt to different task complexities: a thinking mode, in which the model reasons step by step before answering (suited to mathematics, coding, and other hard problems), and a non-thinking mode that returns fast, direct answers for simpler queries. Both modes live in a single model and can be switched per request, as shown in the sketch below.
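The following is a minimal sketch using Hugging Face transformers, following Qwen's published usage pattern for the `enable_thinking` chat-template flag; the checkpoint name, prompt, and generation settings here are placeholders, and exact arguments may differ by library version.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-8B"  # any Qwen3 checkpoint follows the same pattern
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "How many prime numbers are below 30?"}]

# Thinking mode: the model emits a <think>...</think> reasoning block before answering.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,   # set to False for fast, direct answers on simple tasks
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```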
Language Capability:
Qwen3 supports 119 languages and dialects, with robust instruction-following and translation capabilities that make it well suited to global applications.
Reasoning Capability:
Qwen3 excels on benchmarks, especially in mathematics, code generation, and commonsense reasoning, outperforming many mainstream models such as DeepSeek-R1 and OpenAI's o1. Its strength on complex tasks places it among the leading open-source models.
Integration and Tool Utilization:
Qwen3 has strong tool-calling abilities, allowing seamless integration with external tools to handle complex multi-step operations, which makes it well suited to building intelligent assistants and automation workflows.
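As a rough illustration, the sketch below assumes a Qwen3 model served behind an OpenAI-compatible endpoint (for example via vLLM or SGLang); the endpoint URL, model name, and the `get_weather` tool schema are hypothetical placeholders for demonstration, not part of Qwen3 itself.

```python
from openai import OpenAI

# Assumes a Qwen3 model is already being served behind an OpenAI-compatible
# endpoint; the base_url and model name below are placeholders.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, defined only for this example
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="Qwen/Qwen3-8B",
    messages=[{"role": "user", "content": "What's the weather in Hangzhou right now?"}],
    tools=tools,
)

# If the model decides a tool is needed, it returns a structured call instead of text.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```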
Pretraining Dataset:
Qwen3 was pretrained on approximately 36 trillion tokens, greatly expanding its knowledge coverage and reasoning skills. The training process was carried out in multiple optimized stages to ensure strong performance across diverse tasks.