AI · Bullish · arXiv – CS AI · Jun 2 · 7/10
🧠Researchers present a self-healing orchestration framework for tool-augmented large language models that treats reliability as a bounded runtime control problem, achieving 98.8% task success by mapping failure signals to recovery actions and verifying results. The approach outperforms retry-only and full-replanning baselines across multiple benchmarks, particularly excelling when recovery budgets are constrained.
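The idea of mapping failure signals to recovery actions under a fixed budget can be illustrated with a minimal sketch. This is not the paper's implementation: `ToolError`, `run_with_recovery`, `flaky_tool`, the `"timeout"` signal name, and the `RECOVERIES` table are all hypothetical names chosen for illustration.

```python
class ToolError(Exception):
    """Hypothetical tool failure carrying a machine-readable signal."""

    def __init__(self, signal):
        super().__init__(signal)
        self.signal = signal


def run_with_recovery(call, verify, recoveries, budget=2):
    """Run `call`, applying at most `budget` mapped recovery actions."""
    args = {}
    attempts = 0
    while True:
        try:
            result = call(args)
            if verify(result):  # accept only verified results
                return result
            signal = "verification_failed"
        except ToolError as err:
            signal = err.signal
        # Stop when the budget is spent or no recovery is mapped to the signal.
        if attempts >= budget or signal not in recoveries:
            raise RuntimeError(f"unrecovered failure: {signal}")
        args = recoveries[signal](args)  # recovery action rewrites the arguments
        attempts += 1


def flaky_tool(args):
    """Toy tool (assumption): fails until it is given a long enough timeout."""
    if args.get("timeout", 1) < 10:
        raise ToolError("timeout")
    return {"status": "ok", "timeout": args["timeout"]}


# Hypothetical signal-to-action table: a timeout is answered by a longer timeout.
RECOVERIES = {"timeout": lambda a: {**a, "timeout": a.get("timeout", 1) * 10}}

result = run_with_recovery(
    flaky_tool, lambda r: r["status"] == "ok", RECOVERIES, budget=2
)
print(result)
```

With `budget=0` the same call raises `RuntimeError` instead of recovering, which is the bounded part of the design: recovery effort is capped rather than open-ended.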
AI · Bearish · arXiv – CS AI · May 28 · 7/10
🧠Researchers introduce HARP, a methodology for measuring how harm propagates across multi-agent LLM systems when one component is compromised. Testing on a finance-oriented seven-agent system reveals that single-agent compromise creates the strongest amplification effects, while existing defenses struggle to balance security with utility costs.
AI · Bullish · arXiv – CS AI · Apr 14 · 7/10
🧠Researchers introduce Context Kubernetes, an architecture that applies container orchestration principles to managing enterprise knowledge in AI agent systems. The system addresses critical governance, freshness, and security challenges, demonstrating that without proper controls, AI agents leak data in over 26% of queries and serve stale content silently.
AI · Bullish · MarkTechPost · Mar 10 · 7/10
🧠ByteDance has released DeerFlow 2.0, an open-source SuperAgent framework that orchestrates sub-agents, memory, and sandboxes to execute complex tasks autonomously. It marks a shift from AI assistants that mainly suggest actions to systems that carry them out.
🏢 Microsoft
AI · Bullish · OpenAI News · Feb 27 · 7/10 · 5
🧠Amazon Bedrock introduces a new Stateful Runtime Environment for AI agents that provides persistent orchestration, memory capabilities, and secure execution for complex multi-step AI workflows. The service leverages OpenAI technology to enable more sophisticated AI agent operations with maintained state across interactions.
AI · Neutral · arXiv – CS AI · 5d ago · 6/10
🧠A new arXiv paper analyzes the sources of variability in agentic AI systems, distinguishing between token-sampling randomness intrinsic to foundation models and external factors like environmental changes and infrastructure effects. The research clarifies when AI agent outputs are genuinely stochastic versus reproducible, with implications for understanding AI reliability in production deployments.
AI · Neutral · arXiv – CS AI · 6d ago · 6/10
🧠Researchers compare three orchestration approaches for AI agents handling customer-service workflows: declarative agents using natural-language skill files, imperative agents with programmatic state machines, and unscaffolded baseline agents. The study finds that retrieval quality is the dominant bottleneck, and declarative skills improve performance on procedural tasks only when evidence quality is high.
AI · Bullish · arXiv – CS AI · May 28 · 6/10
🧠AgensFlow is an open-source framework that treats multi-agent LLM coordination as a learnable policy problem rather than a fixed pipeline, enabling dynamic routing decisions across skill protocols, agent roles, and model bindings. Evaluated on distributed systems and security tasks, the framework demonstrates that learned coordination outperforms static designs while reducing exploration costs through warm-started policy graphs.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠The CODS 2025 AssetOpsBench competition retrospective reveals critical gaps between public and private evaluation metrics in multi-agent orchestration systems. Hidden test sets dramatically altered performance rankings, particularly in execution tasks where correlations turned negative, while successful teams prioritized guardrails over novel architectures.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 6
🧠Researchers have introduced ESAA (Event Sourcing for Autonomous Agents), a new architecture that improves LLM-based autonomous agents by separating cognitive intention from state mutation using structured JSON events and deterministic orchestration. The system addresses key limitations like context degradation and execution reliability, with successful validation through multi-agent case studies using various LLMs including Claude Sonnet and GPT-5.
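As a rough illustration of separating intention from state mutation, the sketch below has an agent emit JSON events while a deterministic reducer alone changes state. It is not ESAA's actual schema or code: the event types `task_added` and `task_done`, and the helpers `apply_event` and `replay`, are assumptions made up for this example.

```python
import json


def apply_event(state, event):
    """Deterministically fold one event into a new state (no in-place mutation)."""
    if event["type"] == "task_added":
        return {**state, "tasks": state["tasks"] + [event["task"]]}
    if event["type"] == "task_done":
        return {**state, "done": state["done"] + [event["task"]]}
    # Unknown intentions are rejected rather than applied.
    raise ValueError(f"unknown event type: {event['type']}")


def replay(log):
    """Rebuild state from scratch by replaying the append-only event log."""
    state = {"tasks": [], "done": []}
    for line in log:
        state = apply_event(state, json.loads(line))
    return state


# Hypothetical log: each line is one JSON event an agent might have emitted.
log = [
    '{"type": "task_added", "task": "write tests"}',
    '{"type": "task_added", "task": "fix bug"}',
    '{"type": "task_done", "task": "write tests"}',
]

state = replay(log)
print(state)
```

Because state is derived only from the log, replaying the same events always yields the same result, so it does not depend on what the model still holds in context.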
AI · Neutral · OpenAI News · Jan 23 · 5/10 · 4
🧠This article is a technical deep dive into the Codex agent loop, detailing how the Codex CLI orchestrates AI models, tools, prompts, and performance monitoring through the Responses API.