AIBearisharXiv โ CS AI ยท 5h ago7/10
๐ง
Red Teaming Large Reasoning Models
Researchers introduce RT-LRM, a comprehensive benchmark for evaluating the trustworthiness of Large Reasoning Models across truthfulness, safety, and efficiency dimensions. The study reveals that LRMs face significant vulnerabilities including CoT-hijacking and prompt-induced inefficiencies, demonstrating they are more fragile than traditional LLMs when exposed to reasoning-induced risks.