ARCA: Adapter-Residual Credit Assignment When Token Signals Degenerate
Researchers propose ARCA, a token-level credit assignment method for language model reinforcement learning that addresses how existing credit signals degenerate under parameter-efficient fine-tuning approaches like LoRA. By measuring where adapters actually modify hidden states rather than tracking output distribution shifts, ARCA provides non-degenerate credit signals and performs competitively with existing baselines while requiring no additional learned components.
The paper identifies a fundamental technical problem in how modern LLM reinforcement learning assigns credit to individual tokens during training. Most credit assignment methods (surprisal, entropy reduction, and policy divergence) were designed assuming fully trainable policies, but production systems typically use LoRA, which constrains the policy to a low-rank neighborhood around a base model. This mismatch causes existing signals to degenerate, either spreading uniformly across tokens or concentrating on task-irrelevant positions, which severely degrades the quality of the training signal.
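To make the output-side signals named above concrete, here is a minimal sketch of how per-token surprisal, entropy, and policy-to-base KL divergence can be computed from logits. The function names and the list-of-logits input layout are illustrative assumptions, not the paper's code, and the sketch shows only the raw quantities, not how any particular method turns them into credit weights.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one token's vocabulary logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def token_signals(policy_logits, base_logits, token_ids):
    """Return one (surprisal, entropy, kl) tuple per token position.

    policy_logits / base_logits: one list of vocabulary logits per position
    (hypothetical layout). token_ids: the sampled token at each position.
    """
    signals = []
    for pol, base, tok in zip(policy_logits, base_logits, token_ids):
        lp = log_softmax(pol)    # log-probs under the fine-tuned policy
        lb = log_softmax(base)   # log-probs under the frozen base model
        surprisal = -lp[tok]                                   # -log pi(token)
        entropy = -sum(math.exp(a) * a for a in lp)            # H(pi)
        kl = sum(math.exp(a) * (a - b) for a, b in zip(lp, lb))  # KL(pi || base)
        signals.append((surprisal, entropy, kl))
    return signals
```

When the policy's logits stay close to the base model's at every position, the KL term is near zero everywhere, which is one simple way an output-side signal can carry little information about which tokens mattered.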
The research addresses a genuine bottleneck in scaling LLM-RL. LoRA's parameter efficiency makes it practical for fine-tuning billion-parameter models, yet its mathematical constraints weren't properly accounted for in credit assignment design. Prior work treated LoRA as an implementation detail rather than a structural constraint that fundamentally changes how policy changes manifest in output distributions.
ARCA's core insight is to measure adapter salience through hidden-state residuals rather than output changes, which sidesteps the degeneracy problem. By examining where the adapter actually modifies the model's internal representations, the method captures where the policy has changed even when the low-rank constraint leaves output-level signals degenerate. The approach requires no auxiliary learning signals, reward models, or value heads, reducing implementation complexity and training overhead.
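The idea of scoring tokens by the adapter's hidden-state residual can be sketched as follows. This summary does not give ARCA's exact formula, so the scoring rule here is an assumption: for a single LoRA layer with factors A and B, each token's credit is the L2 norm of the adapter's contribution B(Ah) to that token's hidden state h, normalized across the sequence. The function names, the single-layer setup, and the `scale` parameter (standing in for the usual LoRA scaling factor) are all hypothetical.

```python
import math

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def adapter_residual_credit(hidden_states, lora_A, lora_B, scale=1.0):
    """Assumed scoring rule, not the paper's definition.

    hidden_states: T input vectors of size d_in, one per token.
    lora_A: r x d_in down-projection; lora_B: d_out x r up-projection.
    Returns T non-negative weights that sum to 1.
    """
    norms = []
    for h in hidden_states:
        # Residual the adapter adds on top of the frozen base layer's output.
        delta = matvec(lora_B, matvec(lora_A, h))
        norms.append(scale * math.sqrt(sum(v * v for v in delta)))
    total = sum(norms)
    if total == 0.0:
        # Adapter has no effect anywhere (e.g. B still zero-initialized):
        # fall back to uniform weights.
        return [1.0 / len(norms)] * len(norms)
    return [n / total for n in norms]

# Usage: a rank-1 adapter that only reads the first hidden dimension.
credit = adapter_residual_credit(
    hidden_states=[[3.0, 5.0], [1.0, 7.0], [0.0, 2.0]],
    lora_A=[[1.0, 0.0]],
    lora_B=[[1.0], [0.0]],
)
```

Because the score is read directly from the adapter's own weights and activations, a sketch like this needs no extra learned component, which matches the property the paper claims for ARCA.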
The empirical validation on MATH with Qwen-1.7B shows that ARCA remains competitive with existing baselines while producing the non-degenerate credit distributions its analysis predicts. For organizations training language models with RL, this offers a practical improvement in credit signal quality under LoRA. The work primarily impacts ML researchers and LLM developers rather than cryptocurrency markets, advancing foundational techniques in AI system training.
- ARCA addresses token credit assignment degeneracy caused by LoRA's low-rank constraints in LLM-RL training
- Measuring adapter hidden-state residuals provides non-degenerate credit signals without learned reward models
- Existing credit assignment methods fail under parameter-efficient fine-tuning due to structural misalignment
- The method shows competitive performance on MATH benchmarks with improved signal distribution properties
- This advancement benefits LLM training efficiency and reinforcement learning at scale