#conversational-agents News & Analysis

3 articles tagged with #conversational-agents. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · Jun 10 · 7/10

Catching One in Five: LLM-as-Judge Blind Spots in Production Multi-Turn Transaction Agents

A study of a deployed food-and-beverage ordering chatbot finds that LLM-based quality judges catch fewer than 25% of genuine defects: they miss systematic failures in state tracking and multi-turn consistency and perform well only on single-turn issues. The research argues that automated evaluation alone is insufficient for production multi-turn agents and should not replace human review.

AI · Bearish · arXiv – CS AI · Jun 2 · 7/10

Identifying High-Confidence Social Biases in LLMs for Trustworthy Conversational Tutoring Agents

Researchers evaluated large language models used in conversational tutoring systems and found they struggle to detect social biases in educational contexts while maintaining high confidence in incorrect assessments. The study reveals that LLMs are significantly more prone to biased behavior in naturalistic tutoring conversations than in controlled benchmarks, posing risks to student learning outcomes.

AI · Neutral · arXiv – CS AI · Apr 14 · 5/10

Controlling Multimodal Conversational Agents with Coverage-Enhanced Latent Actions

Researchers propose a novel reinforcement learning approach for fine-tuning multimodal conversational agents by learning a compact latent action space instead of operating directly on large text token spaces. The method combines paired image-text data with unpaired text-only data through a cross-modal projector trained with cycle consistency loss, demonstrating superior performance across multiple RL algorithms and conversation tasks.