🧠 AI | Neutral | Importance 6/10

Where to Place the Query? Unveiling and Mitigating Positional Bias in In-Context Learning for Diffusion LLMs via Decoding Dynamics

arXiv – CS AI | Zhengheng Li, Panrui Li, Xuyang Liu, Puzhi Xia
🤖 AI Summary

Researchers show that query placement significantly affects in-context learning performance in Diffusion Large Language Models (dLLMs), challenging prompt conventions inherited from autoregressive models. The study identifies a spatial recency effect in attention mechanisms and proposes Auto-ICL, a training-free strategy that dynamically optimizes query positioning and approaches oracle performance across diverse tasks.

Analysis

This research addresses an architectural difference between diffusion-based and autoregressive (AR) language models that practical applications have overlooked. AR models enforce unidirectional causal masking, which constrains where the query can go; diffusion models use bidirectional attention, so the query can sit anywhere in the context. The study's core finding, that query position rivals the semantic quality of the in-context examples in importance, suggests that mechanically applying AR-derived templates to dLLMs leaves substantial performance on the table.
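The contrast between the two masking schemes, and what "moving the query" means for a prompt, can be sketched in a few lines of Python. This is an illustration only: `place_query`, `can_attend`, and the toy arithmetic demonstrations are hypothetical names and data chosen for this sketch, not anything taken from the paper.

```python
from typing import List


def place_query(demos: List[str], query: str, position: int) -> str:
    """Build an ICL prompt with the query inserted before demos[position].

    position == len(demos) puts the query last, the layout typically
    used with autoregressive models.
    """
    if not 0 <= position <= len(demos):
        raise ValueError("position must be in [0, len(demos)]")
    parts = demos[:position] + [query] + demos[position:]
    return "\n\n".join(parts)


def can_attend(i: int, j: int, causal: bool) -> bool:
    """Whether the token at index i may attend to the token at index j."""
    # Causal masking: only the current and earlier positions are visible.
    # Bidirectional attention: every position is visible.
    return j <= i if causal else True


# Toy demonstrations and query (illustrative data only).
demos = ["Q: 2+2? A: 4", "Q: 3+5? A: 8"]
query = "Q: 7+6? A:"

# Enumerate every slot the query could occupy.
for pos in range(len(demos) + 1):
    print(f"--- query at position {pos} ---")
    print(place_query(demos, query, pos))

# A query token at index 0 cannot see a later demonstration token under a
# causal mask, but can under bidirectional attention.
print(can_attend(0, 5, causal=True))
print(can_attend(0, 5, causal=False))
```

Under the causal mask, a query placed before the demonstrations cannot read them, which is why AR prompts put it last. With bidirectional attention every slot is usable, so the choice of slot becomes a tunable variable.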

Decoding dynamics analysis traces the root cause to a spatial recency effect: attention patterns shift with query location, which in turn changes generation trajectories. Query placement is therefore a first-order design variable rather than a minor implementation detail. The proposed Average Confidence metric is a methodological contribution in its own right. Traditional confidence scores are inadequate for iterative decoding, whereas Average Confidence captures the cumulative information flow across multiple inference steps.
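The summary does not give the formula for Average Confidence, so the following is only a sketch under a stated assumption: that the metric averages, over all decoding steps, the mean probability of the tokens committed at each step. The function names and the toy trace are hypothetical.

```python
from typing import List, Sequence


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def final_step_confidence(trace: Sequence[Sequence[float]]) -> float:
    """Single-step baseline: mean confidence of the last step's tokens."""
    return _mean(trace[-1])


def average_confidence(trace: Sequence[Sequence[float]]) -> float:
    """Assumed definition: mean over steps of the per-step mean confidence.

    trace[t] holds the probabilities of the tokens committed at step t.
    Steps that commit no tokens are skipped.
    """
    per_step: List[float] = [_mean(step) for step in trace if step]
    if not per_step:
        raise ValueError("trace contains no committed tokens")
    return _mean(per_step)


# Toy trace: three decoding steps with an uncertain middle step.
trace = [[0.9, 0.8], [0.6], [0.95, 0.85, 0.9]]

print(round(final_step_confidence(trace), 3))
print(round(average_confidence(trace), 3))
```

In this toy trace the final step looks confident, while the average is pulled down by the uncertain middle step. That difference is the kind of trajectory information a single-step score cannot see.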

For the AI development community, the work offers model optimization without retraining or labeled data. Because Auto-ICL is training-free, it can be applied to existing dLLM deployments as they are. Its robustness across heterogeneous tasks, covering both reasoning and perception, suggests principles that could generalize beyond current model families. Practitioners implementing or evaluating diffusion language models should revisit their prompt engineering conventions, since naive query placement likely costs measurable performance. The research also establishes baselines for spatial in-context learning that future work can build on, and it may reshape best practices in prompt design for bidirectional attention models.
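One way to picture a training-free, label-free placement strategy is a search over candidate query positions that keeps the position the model decodes most confidently. The sketch below is not the paper's Auto-ICL algorithm, whose details the summary does not state. The selection rule, the `decode_fn` interface, and the toy `fake_decode` stand-in are all assumptions made for illustration.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical decoder interface: prompt -> (answer, per-step confidences).
DecodeFn = Callable[[str], Tuple[str, List[List[float]]]]


def average_confidence(trace: List[List[float]]) -> float:
    """Assumed score: mean over steps of the per-step mean confidence."""
    per_step = [sum(step) / len(step) for step in trace if step]
    if not per_step:
        raise ValueError("trace contains no committed tokens")
    return sum(per_step) / len(per_step)


def build_prompt(demos: List[str], query: str, position: int) -> str:
    """Insert the query before demos[position]."""
    return "\n\n".join(demos[:position] + [query] + demos[position:])


def select_query_position(
    demos: List[str], query: str, decode_fn: DecodeFn
) -> Tuple[int, str, float]:
    """Try every slot and keep the one with the highest confidence score.

    No labels are used: the score comes only from the model's own decoding.
    """
    best: Optional[Tuple[int, str, float]] = None
    for position in range(len(demos) + 1):
        answer, trace = decode_fn(build_prompt(demos, query, position))
        score = average_confidence(trace)
        if best is None or score > best[2]:
            best = (position, answer, score)
    assert best is not None  # the loop runs at least once
    return best


def fake_decode(prompt: str) -> Tuple[str, List[List[float]]]:
    """Toy stand-in for a dLLM: more confident the earlier the query appears."""
    offset = prompt.index("Q: 7+6?")
    confidence = 1.0 / (1.0 + offset / 10.0)
    return "13", [[confidence], [confidence]]


demos = ["Q: 2+2? A: 4", "Q: 3+5? A: 8"]
position, answer, score = select_query_position(demos, "Q: 7+6? A:", fake_decode)
print(position, answer, round(score, 3))
```

A search like this costs one decoding pass per candidate position. A real implementation would replace `fake_decode` with calls to an actual diffusion model and might prune the candidate set to limit that cost.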

Key Takeaways
  • Query position is a first-order performance variable in diffusion LLMs, with impact comparable to the semantic quality of the in-context examples.
  • Spatial recency effects in attention mechanisms cause positional sensitivity that varies across different task types.
  • Traditional single-step confidence metrics fail to capture decoding dynamics in diffusion models; Average Confidence provides better calibration.
  • Auto-ICL offers a training-free solution that dynamically optimizes query placement and approaches oracle performance without ground-truth labels.
  • Current practices inappropriately transfer autoregressive prompt templates to diffusion models despite fundamental architectural differences in attention mechanisms.