AI · Neutral · arXiv – CS AI · 9h ago · 6/10
TEA-Bench: A Systematic Benchmarking of Tool-enhanced Emotional Support Dialogue Agent
Researchers introduce TEA-Bench, the first interactive benchmark for evaluating how external tools improve emotional support conversation (ESC) systems. Tests on nine LLMs show that tool augmentation reduces hallucination and improves support quality, but the benefit depends heavily on model capacity: stronger models use tools more effectively than weaker ones.