y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#multi-turn-reasoning News & Analysis

5 articles tagged with #multi-turn-reasoning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

5 articles
AIBullisharXiv – CS AI · May 117/10
🧠

MedAction: Towards Active Multi-turn Clinical Diagnostic LLMs

Researchers introduce MedAction, a new framework and dataset designed to improve how large language models perform clinical diagnosis by simulating real-world multi-turn diagnostic processes. The approach addresses fundamental limitations in current medical LLMs through a tree-structured distillation pipeline that generates high-quality diagnostic trajectories, achieving state-of-the-art performance among open-source models.

AIBullisharXiv – CS AI · Apr 147/10
🧠

UniToolCall: Unifying Tool-Use Representation, Data, and Evaluation for LLM Agents

UniToolCall introduces a standardized framework unifying tool-use representation, training data, and evaluation for LLM agents. The framework combines 22k+ tools and 390k+ training instances with a unified evaluation methodology, enabling fine-tuned models like Qwen3-8B to achieve 93% precision—surpassing GPT, Gemini, and Claude in specific benchmarks.

🧠 Claude🧠 Gemini
AIBullisharXiv – CS AI · 3d ago6/10
🧠

Same Evidence, Different Answers: Canonical-Context On-Policy Distillation for Multi-Turn Language Models

Researchers propose Canonical-Context On-Policy Distillation (CCOPD), a training method that improves large language models' ability to solve problems when information is revealed incrementally across multiple conversation turns rather than all at once. By using a frozen teacher model with complete context to guide a student model receiving fragmented information, CCOPD achieves 32% relative performance improvement on multi-turn tasks while maintaining single-prompt performance.

AINeutralarXiv – CS AI · 4d ago6/10
🧠

Do Agents Think Deeper? A Mechanistic Investigation of Layer-Wise Dynamics in Sequential Planning

Researchers conducted a mechanistic analysis of how large language models allocate computational depth when operating as autonomous agents performing multi-turn planning and tool use. The study reveals that agents progressively recruit deeper layers as task complexity increases, contrasting with prior findings that LLMs underutilize depth in single-turn tasks, suggesting adaptive depth allocation emerges in sequential reasoning scenarios.

AIBullisharXiv – CS AI · May 116/10
🧠

MemSearcher: Training LLMs to Reason, Search and Manage Memory via End-to-End Reinforcement Learning

Researchers introduce MemSearcher, an AI agent framework that optimizes how large language models handle multi-turn interactions by maintaining compact memory instead of concatenating full conversation history. The approach uses a novel multi-context GRPO training method and demonstrates superior performance while maintaining stable token counts, reducing computational overhead.