#multi-turn-agents News & Analysis

6 articles tagged with #multi-turn-agents. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 10 · 7/10

IntentKV: Cross-Turn Intent-Aware KV Cache Pruning for Agent Inference

Researchers introduce IntentKV, a learned KV cache pruning technique that optimizes memory usage for multi-turn LLM agents without modifying the base model. The method achieves 23-30% reductions in peak request tokens and up to 92.6% fewer KV reads under tight memory budgets, addressing a critical bottleneck in long-horizon agent inference.

AI · Bearish · arXiv – CS AI · Jun 10 · 7/10

Catching One in Five: LLM-as-Judge Blind Spots in Production Multi-Turn Transaction Agents

A study of a deployed food-and-beverage ordering chatbot finds that LLM-based quality judges catch fewer than 25% of genuine defects: they miss systematic failures in state tracking and multi-turn consistency and perform well only on single-turn issues. The authors conclude that automated LLM judging is insufficient on its own for production multi-turn agents and should not replace human review.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

ZipRL: Adaptive Multi-Turn Context Compression with Hindsight Response Replay

Researchers introduce ZipRL, an adaptive context compression framework that uses reinforcement learning to efficiently reduce token usage in multi-turn LLM agent tasks while preserving task-critical information. The method incorporates Hindsight Response Replay to address sparse reward problems and demonstrates 27-35% performance improvements over existing approaches on benchmark tasks.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

AgentLens: Interpretable Safety Steering via Mechanistic Subspaces for Multi-Turn Coding Agent

Researchers introduce AgentLens, a white-box defense framework that detects and mitigates safety risks in multi-turn LLM coding agents by intervening in mechanistic subspaces. The framework achieves strong safety detection performance through step-level hidden representation analysis, addressing the limitations of external guardrails in capturing evolving execution risks.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

Self-Evolution for Multi-Turn Tool-Calling Agents via Divergence-Point Preference Learning

Researchers present ToolGraph, a framework that improves multi-turn tool-using AI agents through self-evolution via preference learning. By combining schema-derived topology with divergence-point preference optimization, the system achieves a 16.8% improvement over the baseline on benchmark tasks, with gains concentrated in the airline and retail domains.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

StepOPSD: Step-Aware Online Preference Distillation for Agent Reinforcement Learning

StepOPSD is a reinforcement learning framework that improves credit assignment in multi-turn agent tasks by treating individual steps, rather than entire trajectories, as the unit of learning. The method achieves state-of-the-art results on benchmarks such as ALFWorld and Search-QA, suggesting that step-level preference distillation is particularly effective when trajectory rewards correlate poorly with the quality of individual decisions.