AI × Crypto · Neutral · arXiv – CS AI · 4h ago · 7/10
CREBench: Evaluating Large Language Models in Cryptographic Binary Reverse Engineering
Researchers introduced CREBench, a benchmark for evaluating large language models' capabilities in cryptographic binary reverse engineering. The best-performing model (GPT-5.4) achieved a 64.03% success rate, while human experts scored 92.19%, showing that AI still lags behind human expertise in cryptographic analysis tasks.
GPT-5