AI · Neutral · arXiv – CS AI · 10h ago · 6/10
Do not copy and paste! Rewriting strategies for code retrieval
Researchers evaluated several strategies for code retrieval that use an LLM to rewrite queries, the corpus, or both. Full natural-language transcription with query-corpus augmentation achieves the largest gains, whereas corpus-only rewriting often degrades performance. They also introduce Delta H, a token-entropy measure, as a cheap, rewriter-agnostic metric for predicting when LLM rewriting justifies its computational cost.
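This summary does not give the paper's exact definition of Delta H, so the following is only a rough sketch of what a token-entropy difference could look like. It assumes whitespace tokenization, Shannon entropy over the empirical token distribution, and Delta H taken as the entropy of the rewritten text minus that of the original. The function names `token_entropy` and `delta_h` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def token_entropy(text: str) -> float:
    """Shannon entropy (in bits) of the empirical token distribution.

    Assumption: tokens are whitespace-separated; the paper may use a
    model tokenizer or a different entropy estimate.
    """
    tokens = text.split()
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def delta_h(original: str, rewritten: str) -> float:
    """Assumed form of Delta H: entropy of the rewrite minus entropy of the original."""
    return token_entropy(rewritten) - token_entropy(original)


if __name__ == "__main__":
    code = "def add(a, b): return a + b"
    rewrite = "A function that takes two numbers and returns their sum"
    print(f"Delta H = {delta_h(code, rewrite):.3f} bits")
```

Under these assumptions the measure needs only token counts, with no LLM call, which matches the summary's description of it as cheap and rewriter-agnostic.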