AI · Bullish · Microsoft Research Blog · 5h ago · 6/10
AsgardBench: A benchmark for visually grounded interactive planning
Microsoft Research introduces AsgardBench, a benchmark for evaluating how well embodied AI systems perform visually grounded interactive planning. It tests robots' ability to observe their environment, make decisions, and adapt when conditions change unexpectedly, using kitchen cleaning scenarios as examples.