Aya Vision: A Series of Advanced Vision-Language Models (VLMs) by Cohere For AI
Aya Vision is a set of advanced vision-language models designed to address multilingual performance challenges in multimodal AI systems.
Aya Vision supports 23 languages, including English, French, German, Spanish, Italian, and Portuguese. This breadth makes the models usable across much of the world, particularly for businesses and organizations operating in multilingual markets.
The models can perform a variety of tasks, including image captioning, visual question answering, text translation, and summary generation. This multimodal capability makes Aya Vision valuable in areas such as education, cultural preservation, and accessibility tools.
Cohere For AI is committed to open science and has released Aya Vision’s open weights on Kaggle and Hugging Face, allowing researchers worldwide to access and experiment with these models. This openness fosters collaboration and knowledge sharing in AI research.
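Because the weights are public, the models can be queried with standard open-source tooling. Below is a minimal sketch using the Hugging Face `transformers` library; the model id `CohereForAI/aya-vision-8b` and the `AutoModelForImageTextToText` class are assumptions here, so verify both against the model card on Hugging Face before running:

```python
# Sketch: asking Aya Vision a question about an image via transformers.
# Assumes `transformers` is installed and the model id below is correct;
# check the Hugging Face model card for the exact identifier and hardware
# requirements (the weights are several GB).
MODEL_ID = "CohereForAI/aya-vision-8b"  # assumed id; verify on Hugging Face


def build_messages(image_url: str, question: str) -> list:
    """Build a chat-style multimodal prompt: one image plus one text question."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]


def describe_image(image_url: str, question: str) -> str:
    """Load the model (downloads weights on first call) and generate an answer."""
    # Imported lazily so the prompt-building helper works without the library.
    from transformers import AutoModelForImageTextToText, AutoProcessor

    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = AutoModelForImageTextToText.from_pretrained(MODEL_ID, device_map="auto")

    inputs = processor.apply_chat_template(
        build_messages(image_url, question),
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    output = model.generate(**inputs, max_new_tokens=128)
    # Decode only the newly generated tokens, skipping the prompt.
    return processor.tokenizer.decode(
        output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )


# Usage (downloads the model):
# print(describe_image("https://example.com/photo.jpg", "What is in this picture?"))
```

The chat-template message format (a list of role/content dictionaries mixing `image` and `text` parts) is the convention `transformers` uses for vision-language chat models; the exact generation parameters shown are illustrative defaults, not values prescribed by Cohere.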
Aya Vision is trained using synthetic annotations, a method that leverages AI-generated data labels to enhance model training. This approach is particularly useful in situations where data availability is limited, improving the model’s performance and adaptability.
Cohere has introduced the Aya Vision Benchmark, a new multilingual vision evaluation dataset designed to provide a rigorous assessment framework for multimodal AI. This benchmark helps researchers better understand and improve the performance of vision-language models.
Cohere recently released the Aya Vision models with open weights, supporting multiple languages and multimodal functionality. The model comes in two versions: Aya Vision 8B and Aya Vision 32B.
Both versions are available on Hugging Face under a Creative Commons (CC BY-NC 4.0) license, promoting community-driven innovation and research.