AI · Bearish · arXiv - CS AI · 2d ago · 6/10
Why LLMs Fail: A Failure Analysis and Partial Success Measurement for Automated Security Patch Generation
A study of 319 LLM-generated security patches found that only 24.8% were fully correct, with most failures caused by semantic misunderstanding rather than syntax errors. LLMs preserved functionality well but struggled with the security fixes themselves, and success rates varied dramatically by vulnerability type.