Back to Glossary
Vision Language Models (VLMs)
What are vision language models (VLMs)?
Vision Language Models (VLMs) are a type of AI model that combine visual input with language processing, accelerating a "new era of robotics" by enabling robots to understand natural language instructions in visual contexts.