FIDES: Faithful Inference via Deep Evidence Signals for Retrieval-Memory Conflict in RAG
FIDES is a training-free decoder that improves how language models handle conflicts between retrieved evidence and internal knowledge by applying selective, token-level corrections rather than uniform adjustments. The method reaches 92-94% context fidelity on 70B models and improves on existing contrastive decoding approaches across model scales, suggesting that targeted intervention at critical decoding points is more effective than a uniform penalty.
FIDES addresses a fundamental limitation in retrieval-augmented generation (RAG) systems where language models ignore retrieved context when it contradicts their parametric memory. Existing contrastive decoding methods apply uniform penalties across all tokens, which over-corrects safe predictions while insufficiently addressing genuinely conflicted decision points. The research identifies that retrieval-memory tension concentrates sharply on specific, answer-critical steps rather than distributing evenly throughout generation.
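To make the uniform-penalty baseline concrete, here is a minimal sketch of one common form of contrastive decoding, in which logits computed with the retrieved context are pushed away from logits computed without it using a single global weight. The function name `uniform_contrastive`, the toy three-token vocabulary, and all numeric values are illustrative assumptions and are not taken from the FIDES work.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def uniform_contrastive(logits_with_ctx, logits_without_ctx, alpha):
    """Apply the same contrastive weight alpha to every token at every step.

    adjusted = (1 + alpha) * with_context - alpha * without_context
    (a common contrastive form; assumed here for illustration).
    """
    return [
        (1 + alpha) * w - alpha * wo
        for w, wo in zip(logits_with_ctx, logits_without_ctx)
    ]


# Toy logits over a three-token vocabulary (made-up numbers).
# "agree": context and parametric memory both prefer token 0.
agree_with, agree_without = [4.0, 1.0, 0.0], [3.5, 1.0, 0.0]
# "conflict": memory prefers token 0, retrieved context prefers token 1.
conflict_with, conflict_without = [2.0, 2.2, 0.0], [3.0, 0.5, 0.0]

for name, w, wo in [
    ("agree", agree_with, agree_without),
    ("conflict", conflict_with, conflict_without),
]:
    adjusted = uniform_contrastive(w, wo, alpha=1.0)
    print(name, [round(p, 3) for p in softmax(adjusted)])
```

Because `alpha` is global, the agreeing step is reshaped just as aggressively as the conflicted one, which is the over-correction of safe predictions described above.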
The proposed approach changes how conflict resolution is handled within RAG pipelines. Rather than tuning a single global contrastive weight, FIDES combines three complementary internal signals (output surface features, hidden layer representations, and prediction trajectory patterns) to measure conflict depth at each decoding step. Fusing these signals lets the decoder calibrate intervention strength token by token without requiring additional training.
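The per-step calibration described above can be sketched as a gate on the contrastive weight. The summary does not give the exact signal definitions or the fusion rule, so everything below is an assumption made for illustration: the surface signal is a Jensen-Shannon divergence between output distributions, the hidden signal is a rescaled cosine distance between hidden vectors, the trajectory signal is the fraction of layer-to-layer changes in the top prediction, and the fusion is a plain average. All function names and toy values are hypothetical.

```python
import math


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; lies in [0, 1]. (Assumed surface signal.)"""
    mix = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)


def cosine_distance(u, v):
    """Cosine distance rescaled to [0, 1]. (Assumed hidden-state signal.)"""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    cos = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    return min(1.0, max(0.0, (1 - cos) / 2))


def trajectory_instability(top_token_per_layer):
    """Fraction of adjacent layers whose top token differs. (Assumed trajectory signal.)"""
    pairs = list(zip(top_token_per_layer, top_token_per_layer[1:]))
    flips = sum(1 for a, b in pairs if a != b)
    return flips / max(1, len(pairs))


def step_conflict_score(logits_with, logits_without, hidden_with, hidden_without, top_tokens):
    """Fuse the three signals with a simple mean (hypothetical fusion rule)."""
    surface = js_divergence(softmax(logits_with), softmax(logits_without))
    hidden = cosine_distance(hidden_with, hidden_without)
    trajectory = trajectory_instability(top_tokens)
    return (surface + hidden + trajectory) / 3


def selective_contrastive(logits_with, logits_without, score, alpha_max=1.0):
    """Scale the contrastive weight by the per-step conflict score."""
    alpha = alpha_max * score
    return [(1 + alpha) * w - alpha * wo for w, wo in zip(logits_with, logits_without)]


# Toy inputs (made-up numbers): an agreeing step and a conflicted step.
agree_score = step_conflict_score(
    [4.0, 1.0, 0.0], [3.5, 1.0, 0.0], [1.0, 0.2, 0.1], [0.9, 0.25, 0.1], [0, 0, 0, 0]
)
conflict_score = step_conflict_score(
    [2.0, 2.2, 0.0], [3.0, 0.5, 0.0], [0.1, 1.0, -0.3], [1.0, 0.1, 0.2], [0, 0, 1, 0, 1]
)
print("agree score:", round(agree_score, 3))
print("conflict score:", round(conflict_score, 3))
print(selective_contrastive([2.0, 2.2, 0.0], [3.0, 0.5, 0.0], conflict_score))
```

Under these assumptions, a step where the two runs agree receives a weight near zero and is left almost untouched, while a conflicted step receives a larger correction. Reading hidden states and per-layer predictions from a real model would need additional plumbing that this sketch leaves out.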
The empirical validation spans 18 configurations across model scales from 7B to 70B parameters and reports consistent improvements of 3-13 points in context fidelity over the strongest training-free baselines. The 70B results, with 92-94% fidelity alongside 62-63% F1, suggest that token-level selectivity preserves generation quality that coarse contrastive rules suppress. This matters for production RAG systems, where the balance between following external evidence and maintaining coherent generation is critical for reliability.
Because FIDES is training-free, it can be deployed on existing model architectures without fine-tuning overhead. Future work may extend this selective intervention framework to decoding challenges beyond retrieval-memory conflict, and it could influence broader practice in context-aware language model inference.
- FIDES achieves 92-94% context fidelity on 70B models by applying token-level selective corrections rather than uniform contrastive penalties
- The method probes three internal signal depths to measure retrieval-memory conflict at specific decoding steps, improving over single global weight approaches
- Outperforms strongest training-free baselines by 3-13 points across 18 evaluation settings spanning multiple model scales
- Training-free decoder design enables immediate deployment across existing architectures without fine-tuning requirements
- Token-level selectivity unlocks 62-63% F1 scores that uniform contrastive rules suppress by over-correcting non-critical tokens