AgentPSO: Evolving Agent Reasoning Skill via Multi-agent Particle Swarm Optimization
Researchers introduce AgentPSO, a framework that evolves multi-agent reasoning skills in large language models using particle swarm optimization principles. Rather than relying on static agents or inference-time debate, the system enables agents to iteratively improve their reasoning capabilities through self-reflection and collective learning, demonstrating improved performance and cross-benchmark transferability without modifying underlying model parameters.
AgentPSO addresses a fundamental limitation in current multi-agent AI systems: the agents themselves don't improve over time. While existing approaches treat agents as static entities that debate at inference time, this research introduces a dynamic evolution mechanism where agents function like particles in a swarm, continuously refining their reasoning strategies. The framework eschews traditional parameter updating, instead treating agent skills as natural-language states that move through semantic space guided by personal and collective experience.
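The skill-as-state idea can be sketched as a minimal evolution loop. Everything below is illustrative rather than the authors' actual API: `Agent`, `evolve`, and the toy `evaluate`/`blend` functions are hypothetical stand-ins, and in AgentPSO the blend step would be an LLM call that rewrites a skill using self-reflection and peer trajectories, not string arithmetic.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    skill: str                      # natural-language reasoning skill
    best_skill: str = ""
    best_score: float = float("-inf")

def evolve(agents, evaluate, blend, rounds=3):
    """evaluate(skill) -> score; blend(skill, pbest, gbest) -> new skill."""
    global_best, global_score = "", float("-inf")
    for _ in range(rounds):
        # Score every agent; track personal and global bests (the PSO analog).
        for a in agents:
            score = evaluate(a.skill)
            if score > a.best_score:
                a.best_skill, a.best_score = a.skill, score
            if score > global_score:
                global_best, global_score = a.skill, score
        # "Move" each skill toward its personal best and the swarm's best.
        # In AgentPSO this step would be an LLM rewrite of the skill text.
        for a in agents:
            a.skill = blend(a.skill, a.best_skill, global_best)
    return global_best, global_score

# Toy stand-ins: score = number of useful tactics mentioned; blend = word union.
evaluate = lambda s: len(set(s.split()) & {"decompose", "verify", "reflect"})
blend = lambda skill, pbest, gbest: " ".join(
    sorted(set(skill.split()) | set(gbest.split()))
)

best, best_score = evolve(
    [Agent("decompose"), Agent("verify"), Agent("reflect")], evaluate, blend
)
```

With these toy functions the swarm converges in a few rounds to a skill mentioning all three tactics, because each agent accumulates whatever the current global best contains.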
The approach builds on decades of particle swarm optimization research applied to a novel domain: LLM reasoning. By incorporating self-reflective directions derived from peer trajectories, agents learn not just from their own failures and successes but from the entire population's discoveries. This mirrors successful patterns in evolutionary algorithms and multi-agent reinforcement learning, but preserves interpretability through natural language skills rather than opaque parameter weights.
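For readers unfamiliar with the lineage, the classic numeric PSO update that AgentPSO adapts is worth seeing concretely: each particle's velocity is pulled toward its personal best and the swarm's global best. The implementation below is a standard textbook sketch on a toy objective (function and parameter names are ours, not the paper's).

```python
import random

def pso_minimize(objective, dim=2, n_particles=10, iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Classic PSO: v = w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x).
    AgentPSO replaces numeric positions with natural-language skills."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:          # update personal best
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:         # update global best
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Sphere function: global minimum of 0 at the origin.
best, best_val = pso_minimize(lambda x: sum(v * v for v in x))
```

The inertia and attraction coefficients (`w`, `c1`, `c2`) are conventional defaults; the key structural point is the two-term pull toward personal and collective experience, which is exactly the dynamic AgentPSO transplants into semantic space.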
The research carries implications for AI developers and researchers working on reasoning systems. The ability to evolve reasoning skills without model retraining reduces computational costs while potentially improving generalization. The demonstrated transferability across benchmarks and backbone models suggests the system captures fundamental reasoning procedures rather than exploiting benchmark-specific patterns, addressing a persistent concern in AI evaluation.
Future developments likely include scaling AgentPSO to larger agent populations, applying it to domain-specific reasoning tasks, and exploring how evolved skills transfer to entirely new model architectures. The open-source release enables rapid community iteration and validation across diverse use cases.
- AgentPSO enables agents to evolve reasoning skills iteratively without modifying underlying language model parameters, reducing computational overhead.
- The framework combines personal best performance with global population discovery, allowing agents to learn from both individual and collective experience.
- Evolved reasoning skills demonstrate cross-benchmark and cross-model transferability, suggesting capture of generalizable reasoning procedures.
- Multi-agent particle swarm approaches offer an alternative to inference-time debate by embedding improvement into an iterative evolution loop rather than model training.
- Open-source release positions AgentPSO as a foundation for further research in adaptive multi-agent AI systems.