Tag

#vision-language-action

4 articles

Robbyant Releases LingBot-VLA 2.0: An Open-Source 6B Vision-Language-Action (VLA) Model for Cross-Embodiment Robot Manipulation

Learn how LingBot-VLA 2.0, an open-source 6B vision-language-action model, connects vision, language, and action so that a single model can work across many different robot types.

Jul 826

Meet Qwen-RobotSuite: Three Embodied AI Models for VLA Manipulation, Video World Modeling, and Navigation

Learn about Qwen-RobotSuite, a set of three embodied AI models that help robots understand and interact with the physical world through manipulation, video world modeling, and navigation.

Jun 1632

How to Build a Lightweight Vision-Language-Action-Inspired Embodied Agent with Latent World Modeling and Model Predictive Control

Build a lightweight vision-language-action-inspired embodied agent that learns to perceive, plan, predict, and replan directly from pixel observations in a grid world environment.

Apr 2867

Physical Intelligence Team Unveils MEM for Robots: A Multi-Scale Memory System Giving Gemma 3-4B VLAs 15-Minute Context for Complex Tasks

This explainer explores MEM, a multi-scale memory system that extends the context window of Vision-Language-Action (VLA) models to 15 minutes, enabling robots to perform complex, multi-step tasks.

Mar 3153