Emergence of Context Characteristics Sensitivity in Large Language Models
Researchers studied how large language models develop sensitivity to context characteristics across three stages of instruction fine-tuning: supervised fine-tuning, direct preference optimization, and reinforcement learning. The study found that models progressively learn to favor contexts that are easy to understand, long, and similar to the query. Later training stages either reinforce or resolve these preferences, depending on how the training datasets are designed.
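
To make two of these characteristics concrete, the sketch below computes a context's length and its lexical similarity to a query. The summary does not say how the study measured either property, so everything here is an assumption for illustration: the function name `context_characteristics` is hypothetical, length is counted in simple word tokens, and similarity is approximated with Jaccard overlap of token sets. Understandability is not covered.

```python
import re


def tokenize(text):
    """Lowercase word tokens; a deliberately simple stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def context_characteristics(query, context):
    """Return two illustrative characteristics of a context for a given query.

    Hypothetical helper: the metrics below are assumptions, not the study's.
    """
    query_tokens = set(tokenize(query))
    context_token_list = tokenize(context)
    context_tokens = set(context_token_list)

    # Jaccard overlap: shared tokens divided by all distinct tokens.
    union = query_tokens | context_tokens
    similarity = len(query_tokens & context_tokens) / len(union) if union else 0.0

    return {
        "length": len(context_token_list),  # length in word tokens
        "query_similarity": similarity,
    }


query = "Who wrote Hamlet?"
short_ctx = "Shakespeare wrote Hamlet."
long_ctx = "Hamlet is a tragedy that was written by William Shakespeare around 1600."

for ctx in (short_ctx, long_ctx):
    print(context_characteristics(query, ctx))
```

Scoring candidate contexts this way would let one check whether a model's preferred context tends to be the longer one or the one closer to the query. An embedding-based similarity could replace the token overlap if lexical matching is too coarse.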