Tag
Explore by tags

QVQ-Max
QVQ-Max is a visual reasoning model developed by Alibaba, based on Qwen2-VL-72B. It is designed to improve visual understanding and complex problem solving.

Gemini 2.5 Pro
Gemini 2.5 Pro is an AI model launched by Google, which describes it as its "most intelligent model" yet. It is designed to handle complex tasks and excels at reasoning, coding, and multimodal input processing.

Qwen2.5-VL-32B
Qwen2.5-VL-32B is a multimodal vision-language model released by Alibaba, featuring 32 billion parameters. It excels in tasks such as image understanding, mathematical reasoning, and text generation.

Aya Vision
Aya Vision is a set of advanced vision-language models designed to address multilingual performance challenges in multimodal AI systems.

PaliGemma 2 Mix
PaliGemma 2 Mix is a multi-task vision-language model (VLM) launched by Google.

Qwen2.5-VL
Qwen2.5-VL is the latest flagship vision-language model launched by Alibaba’s Tongyi Qianwen team, featuring significant technological advancements and a wide range of application capabilities.

Kimi K1.5
Kimi K1.5 is a new-generation multimodal reasoning model launched by Moonshot AI, offering strong reasoning and multimodal processing capabilities.

MiniMax-01
The MiniMax-01 series, launched by MiniMax (the company behind Hailuo AI), comprises open-source large language models and vision multimodal models.

Moondream
Moondream is an open-source vision-language model designed to provide efficient image processing and understanding capabilities.