Category
Explore by categories
PlayDialog
PlayDialog is an advanced AI voice model designed to provide a smooth and expressive conversational experience.
Qwen2.5-Coder
Qwen2.5-Coder is the latest open-source model in Alibaba's Qwen series, focused on tasks such as code generation, code reasoning, and code repair.
CogSound
CogSound is a sound effects generation model developed by Zhipu AI, designed to create sound effects that match the visual content of AI-generated videos. It integrates closely with the video generation model CogVideoX v1.5.
CogVideoX v1.5
CogVideoX v1.5 is the latest open-source video generation model developed by the Zhipu AI team, designed to enhance image-to-video (I2V) generation quality and capabilities.
Hunyuan-Large
Hunyuan-Large is Tencent’s recently open-sourced, large-scale Mixture of Experts (MoE) model, featuring 389 billion total parameters and 52 billion active parameters.
Hunyuan3D
Hunyuan3D-1.0 is a recently open-sourced, high-efficiency 3D generation model from Tencent, supporting both Text-to-3D and Image-to-3D generation.
SmolLM2
SmolLM2 is a series of compact language models recently released by Hugging Face, designed specifically for on-device applications.
MobileLLM
MobileLLM is a highly efficient language model launched by Meta, specifically designed for mobile devices and resource-constrained environments.
Sonic
Sonic is a low-latency voice generation model developed by Cartesia AI, designed to provide real-time conversational AI solutions.
PixVerse V3
PixVerse V3 is the latest PixVerse AI video generation model, aimed at enhancing user creativity.
Stable Diffusion 3.5
Stable Diffusion 3.5 marks a significant advancement in the series, showcasing enhanced capabilities for professional-level image generation.
GLM-4-Voice
GLM-4-Voice is an end-to-end speech model developed by Zhipu AI, designed for real-time speech interaction in both Chinese and English. It can understand and generate speech directly, and it adjusts emotional tone, pitch, speed, and accent based on user instructions.