arXiv – CS AI · 6h ago
🧠
Knowledge-Level Consistency Reinforcement Learning: Dual-Fact Alignment for Long-Form Factuality
Researchers propose KLCF, a reinforcement learning framework designed to reduce hallucinations in large language models during long-form text generation by aligning a policy model's knowledge distribution with its base model's parametric knowledge. The approach uses a Dual-Fact Alignment mechanism with factual checklists and truthfulness rewards, demonstrating consistent improvements across benchmarks without requiring external retrieval.
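For intuition only, here is a minimal Python sketch of how a checklist-coverage term and a truthfulness term might be combined into a single scalar reward for RL fine-tuning. The `Claim` structure, the `klcf_style_reward` name, the substring-matching heuristic, and the equal `alpha` weighting are all illustrative assumptions, not the paper's actual formulation.

```python
from dataclasses import dataclass

@dataclass
class Claim:
    text: str        # an atomic factual claim extracted from the generated response
    truthful: bool   # judged correct by some verifier (assumed to exist)

def klcf_style_reward(claims: list[Claim],
                      checklist: list[str],
                      alpha: float = 0.5) -> float:
    """Toy reward: checklist coverage (did the response mention the expected
    facts?) blended with truthfulness (are the emitted claims correct?).
    Weighting and scoring are placeholders, not the paper's method."""
    if not claims or not checklist:
        return 0.0
    # Coverage: fraction of checklist items mentioned by at least one claim.
    covered = sum(any(item.lower() in c.text.lower() for c in claims)
                  for item in checklist)
    coverage = covered / len(checklist)
    # Truthfulness: fraction of emitted claims judged correct.
    truthfulness = sum(c.truthful for c in claims) / len(claims)
    return alpha * coverage + (1.0 - alpha) * truthfulness

# Example usage with toy data:
claims = [Claim("Marie Curie won two Nobel Prizes.", True),
          Claim("She was born in Vienna.", False)]
checklist = ["two Nobel Prizes", "born in Warsaw"]
print(klcf_style_reward(claims, checklist))  # 0.5 * 0.5 + 0.5 * 0.5 = 0.5
```

A scalar of this shape could then serve as the reward signal in a standard policy-optimization loop; the actual KLCF objective and its Dual-Fact Alignment details are specified in the paper, not here.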