AI · Neutral · arXiv — CS AI · 7h ago · 6/10
Hallucination as output-boundary misclassification: a composite abstention architecture for language models
Researchers propose a composite architecture combining instruction-based refusal with a structural abstention gate to reduce hallucinations in large language models. The system uses a support deficit score derived from self-consistency, paraphrase stability, and citation coverage to block unreliable outputs, achieving better accuracy than either mechanism alone across multiple models.
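A minimal sketch of what such a gate could look like. The summary names three signals (self-consistency, paraphrase stability, citation coverage) combined into a support deficit score that blocks outputs; the weights, threshold, and exact signal definitions below are illustrative assumptions, not the authors' formulation.

```python
# Illustrative sketch of a "support deficit" abstention gate.
# Weights, threshold, and signal definitions are assumptions for
# demonstration, not the paper's actual method.

def self_consistency(samples):
    """Fraction of sampled answers agreeing with the majority answer."""
    if not samples:
        return 0.0
    top = max(set(samples), key=samples.count)
    return samples.count(top) / len(samples)

def paraphrase_stability(answers_across_paraphrases):
    """Agreement of answers when the prompt is paraphrased."""
    return self_consistency(answers_across_paraphrases)

def citation_coverage(claims, supported_claims):
    """Share of output claims backed by a retrieved citation."""
    if not claims:
        return 1.0
    return len(supported_claims & set(claims)) / len(claims)

def support_deficit(consistency, stability, coverage,
                    weights=(0.4, 0.3, 0.3)):
    """Weighted deficit: 0.0 = fully supported, 1.0 = unsupported."""
    support = (weights[0] * consistency
               + weights[1] * stability
               + weights[2] * coverage)
    return 1.0 - support

def should_abstain(deficit, threshold=0.5):
    """Structural gate: block the output when the deficit is too high."""
    return deficit >= threshold
```

For example, with consistency 0.75, stability 1.0, and coverage 0.5, the deficit is 1 - (0.4·0.75 + 0.3·1.0 + 0.3·0.5) = 0.25, below the illustrative 0.5 threshold, so the output would pass the gate.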