y0news

#role-playing-agents News & Analysis

4 articles tagged with #role-playing-agents. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · 2d ago · 6/10

ArcANE: Do Role-Playing Language Agents Stay in Character at the Right Time?

Researchers introduce ArcANE, a benchmark for evaluating whether role-playing language agents maintain character consistency across narrative arcs rather than fixed personas. The benchmark spans 17 novels and 80 characters, revealing that conditioning on character arc information significantly improves model performance, especially for scenarios outside source texts.

AI · Neutral · arXiv – CS AI · 5d ago · 6/10

RoleCDE: Benchmarking and Mitigating Role-Alignment Trade-offs in Role-Playing Agents

Researchers introduce RoleCDE, a benchmark for evaluating LLM-based role-playing agents, revealing a 'Role Value Decoupling' phenomenon in which LLMs default to alignment-oriented decisions over role-specific values when the two conflict. Fine-tuning on RoleCDE data mitigates this behavior while preserving general performance.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

RPA-Check: A Multi-Stage Automated Framework for Evaluating Dynamic LLM-based Role-Playing Agents

RPA-Check introduces an automated four-stage framework for evaluating Large Language Model-based Role-Playing Agents in complex scenarios, addressing the gap in standard NLP metrics for assessing role adherence and narrative consistency. Testing across legal scenarios reveals that smaller, instruction-tuned models (8-9B parameters) outperform larger models in procedural consistency, suggesting that performance on this task does not simply improve with model scale.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

Rethinking Role-Playing Evaluation: Anonymous Benchmarking and a Systematic Study of Personality Effects

Researchers propose an anonymous evaluation method for Role-Playing Agents (RPAs) built on large language models, revealing that current benchmarks are biased by character name recognition. The study shows that incorporating personality traits, whether human-annotated or self-generated by AI models, significantly improves role-playing performance under anonymous conditions.