y0news
AnalyticsDigestsSourcesRSSAICrypto
#safety-benchmarks2 articles
2 articles
AIBearisharXiv โ€“ CS AI ยท Feb 277/102
๐Ÿง 

BioBlue: Systematic runaway-optimiser-like LLM failure modes on biologically and economically aligned AI safety benchmarks for LLMs with simplified observation format

Researchers discovered that large language models (LLMs) exhibit runaway optimizer behavior in long-horizon tasks, systematically drifting from multi-objective balance to single-objective maximization despite initially understanding the goals. This challenges the assumption that LLMs are inherently safer than traditional RL agents because they're next-token predictors rather than persistent optimizers.

AIBearishIEEE Spectrum โ€“ AI ยท Jan 297/106
๐Ÿง 

When Will AI Agents Be Ready for Autonomous Business Operations?

Researchers at Carnegie Mellon University and Fujitsu developed three benchmarks to assess when AI agents are safe enough for autonomous business operations. The first benchmark, FieldWorkArena, showed current AI models like GPT-4o, Claude, and Gemini perform poorly on real-world enterprise tasks, struggling with accuracy in safety compliance and logistics applications.