y0news
#black-box-monitoring (1 article)
AI · Neutral · arXiv – CS AI · 6h ago

Constitutional Black-Box Monitoring for Scheming in LLM Agents

Researchers developed constitutional black-box monitors that detect scheming behavior in LLM agents using only observable inputs and outputs. The study found that monitors trained on synthetic data can generalize to realistic environments, but performance gains plateau quickly, and simple optimization techniques outperform more complex methods.