AINeutralarXiv โ CS AI ยท 5h ago
๐ง
Build, Judge, Optimize: A Blueprint for Continuous Improvement of Multi-Agent Consumer Assistants
Researchers present a blueprint for evaluating and optimizing multi-agent conversational shopping assistants, addressing challenges in multi-turn interactions and tightly coupled AI systems. The paper introduces evaluation rubrics and two prompt-optimization strategies including a novel Multi-Agent Multi-Turn GEPA approach for system-level optimization.