🧠 AI🔴 BearishImportance 6/10

There's a Benchmark Test That Measures AI 'Bullshit'—Most Models Fail

Decrypt|Jose Antonio Lanz|March 10, 2026 at 07:26 PM

There's a Benchmark Test That Measures AI 'Bullshit'—Most Models Fail — image 2

2 images via Decrypt

🤖AI Summary

BullshitBench, a new benchmark test, evaluates AI models' ability to detect nonsensical questions versus confidently providing incorrect answers. The results show most AI models fail this test, highlighting a significant reliability issue in current AI systems.

Key Takeaways

→BullshitBench is a new benchmark specifically designed to test AI models' ability to identify nonsensical questions.
→Most current AI models fail the test by confidently answering questions that don't make sense.
→This reveals a critical flaw in AI systems' ability to recognize when they should refuse to answer.
→The results highlight ongoing reliability and trustworthiness issues in AI technology.
→This benchmark could become an important metric for evaluating AI model quality and safety.