6 articles tagged with #social-reasoning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Bullish · arXiv – CS AI · Apr 13 · 7/10
🧠 Researchers introduce a hybrid framework combining probabilistic models with large language models to improve social reasoning in AI agents, achieving a 67% win rate against human players in the game Avalon, a breakthrough in AI's ability to infer beliefs and intentions from incomplete information.
AI · Bullish · arXiv – CS AI · 6d ago · 6/10
🧠 Researchers introduce CoSToM, a framework that uses causal tracing and activation steering to improve Theory of Mind alignment in large language models. The work addresses a critical gap between LLMs' internal knowledge and external behavior, demonstrating that targeted interventions in specific neural layers can enhance social reasoning capabilities and dialogue quality.
AI · Bullish · arXiv – CS AI · Mar 11 · 6/10
🧠 Researchers introduce Social-R1, a reinforcement learning framework that enhances social reasoning in large language models by training on adversarial examples. The approach enables a 4B parameter model to outperform larger models across eight benchmarks by supervising the entire reasoning process rather than just outcomes.
AI · Bearish · arXiv – CS AI · Mar 3 · 6/10 · 4
🧠 Researchers introduced SimpleToM, a benchmark revealing that state-of-the-art language models can infer mental states but struggle to apply that knowledge for behavior prediction and judgment. The study exposes a critical gap between explicit Theory of Mind inference and implicit application in real-world scenarios.
AI · Neutral · arXiv – CS AI · Apr 6 · 4/10
🧠 Research reveals that large language models can reproduce the qualitative structure of human social reasoning but struggle with quantitative magnitude calibration. Pragmatic prompting strategies that consider speaker knowledge and motives can improve this calibration, though fine-grained accuracy remains partially unresolved.
AI · Neutral · arXiv – CS AI · Mar 5 · 4/10
🧠 Researchers evaluated five Multimodal Large Language Models (MLLMs) on their ability to reason about social norms in both text and image scenarios. GPT-4o performed best overall, while all models performed better on text-based norm reasoning than on image-based scenarios.
🧠 GPT-4