AI · Bearish · arXiv – CS AI · 6h ago · 7/10
ABC-Bench: An Agentic Bio-Capabilities Benchmark for Biosecurity
Researchers introduced ABC-Bench, a benchmark that tests LLM agents on biosecurity-relevant tasks, including DNA design and evasion of synthesis screening. Every AI agent tested outperformed the human expert baselines, and OpenAI's o4-mini-high generated functional wet-lab scripts, raising urgent questions about AI capabilities in dual-use biological research.
🏢 OpenAI