AI · Bullish · arXiv – CS AI · 6h ago · 6/10
LLM-Based Code Documentation Generation and Multi-Judge Evaluation
Researchers developed a framework that uses eight large language models to generate source code documentation automatically, paired with a multi-LLM evaluation system that scores the outputs on nine quality criteria. Tests on a medical physics library showed a 42% performance gap between the top and bottom models, suggesting the framework can reduce manual documentation effort for safety-critical software.
🧠 Gemini