y0news
🧠 AI · ⚪ Neutral · Importance 6/10

EASE-TTT: Evidence-Aligned Selective Test-Time Training for Long-Context Question Answering

arXiv – CS AI | Xiaopeng Yuan, Zebin Wang, Suwen Wang, Zongxin Yang, Haohan Wang, Yushun Dong
🤖 AI Summary

Researchers present EASE-TTT, a novel framework combining within-context retrieval with test-time adaptation to improve long-context question answering in smaller language models. The method identifies evidence chunks and converts them into soft attention supervision targets, allowing models to focus on relevant information while processing the full context, outperforming existing retrieval-only and generic adaptation baselines.

Analysis

The research addresses a persistent limitation in language model performance: smaller models struggle with long-context question answering even when relevant evidence exists in the input. Traditional approaches either expose evidence chunks at the input level without adapting model behavior, or apply generic self-supervised training objectives that fail to distinguish which context positions actually support the correct answer. EASE-TTT bridges this gap by creating a hybrid framework that leverages evidence localization to guide the model's attention mechanisms during inference.
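To make the idea of "soft attention supervision targets" concrete, here is a minimal sketch of one way scored evidence chunks could be turned into a per-token target distribution and compared against a model's attention. Everything in it is an assumption for illustration: the `(start, end, score)` chunk format, the `temperature` and `floor` parameters, and the KL-divergence loss are not taken from the paper, which may define its targets and objective differently.

```python
import math


def soft_attention_targets(num_tokens, chunks, temperature=1.0, floor=0.05):
    """Build a per-token target distribution from scored evidence chunks.

    chunks is a list of (start, end, score) with end exclusive.
    This format is hypothetical, not the paper's.
    """
    uniform = 1.0 / num_tokens
    if not chunks:
        # No evidence found: fall back to a uniform target.
        return [uniform] * num_tokens

    # Softmax over chunk scores gives each chunk a share of the mass.
    max_score = max(score for _, _, score in chunks)
    weights = [math.exp((score - max_score) / temperature)
               for _, _, score in chunks]
    total = sum(weights)

    # Spread each chunk's share evenly over the tokens it covers.
    evidence = [0.0] * num_tokens
    for (start, end, _), weight in zip(chunks, weights):
        share = (weight / total) / (end - start)
        for pos in range(start, end):
            evidence[pos] += share

    # Mix with a small uniform floor so non-evidence tokens keep some mass
    # (the "soft" part of the target; the floor value is an assumption).
    return [(1.0 - floor) * e + floor * uniform for e in evidence]


def attention_alignment_loss(attention, targets, eps=1e-12):
    """KL(targets || attention) over context positions (assumed loss)."""
    return sum(t * math.log((t + eps) / (a + eps))
               for t, a in zip(targets, attention) if t > 0)


# Usage: a 10-token context with two evidence chunks.
targets = soft_attention_targets(10, [(2, 4, 2.0), (7, 8, 0.5)])
uniform_attention = [0.1] * 10
loss = attention_alignment_loss(uniform_attention, targets)
```

The uniform floor keeps the target from being a hard mask, so a model nudged toward it would still attend to the rest of the context, which matches the summary's point that the full context remains in play.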

This work builds on two parallel research trends: the push toward more efficient inference-time adaptation through test-time training, and the recognition that retrieval-augmented methods improve performance on knowledge-intensive tasks. Previous work on query-only test-time training (qTTT) demonstrated efficiency gains but lacked semantic grounding; EASE-TTT grounds adaptation in the positions of actual evidence rather than in generic span-level objectives.
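The difference between a generic objective and an evidence-aligned one can be illustrated with a toy, single-head example. The sketch below assumes that "query-only" adaptation means only the query side is updated while the keys stay frozen, and that alignment is measured by KL divergence between a target distribution and the attention weights. Both are assumptions made for illustration; the vectors, learning rate, and unscaled dot-product scores are hypothetical and are not the paper's implementation.

```python
import math


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def attention(query, keys):
    """Unscaled dot-product attention weights for a single query vector."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    return softmax(scores)


def kl(targets, probs, eps=1e-12):
    return sum(t * math.log((t + eps) / (p + eps))
               for t, p in zip(targets, probs) if t > 0)


def adapt_query(query, keys, targets, lr=0.5, steps=20):
    """Gradient descent on the query only; the keys are never modified."""
    query = list(query)
    for _ in range(steps):
        probs = attention(query, keys)
        # For targets that sum to 1, d KL / d score_i = probs_i - targets_i,
        # and score_i = query . keys_i, so the chain rule gives:
        grad = [sum((p - t) * key[d]
                    for p, t, key in zip(probs, targets, keys))
                for d in range(len(query))]
        query = [q - lr * g for q, g in zip(query, grad)]
    return query


# Usage: four context positions, with evidence at position 1.
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
targets = [0.05, 0.85, 0.05, 0.05]
start = [0.0, 0.0]

loss_before = kl(targets, attention(start, keys))
adapted = adapt_query(start, keys, targets)
loss_after = kl(targets, attention(adapted, keys))
```

In this toy setting the adapted query shifts attention toward the evidence position without touching the keys, which is the intuition behind grounding the update in evidence positions rather than in an objective that treats all spans alike.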

The framework has practical implications for deployment. For developers running smaller language models in resource-constrained environments, EASE-TTT offers a computationally efficient alternative to retrieval pipelines that truncate the input to a few retrieved chunks. The method generates answers from the full original context rather than replacing it with retrieved text, preserving contextual nuance while improving accuracy. Across six LongBench QA benchmarks and three decoder-only models, EASE-TTT outperforms full-context inference, retrieval-only baselines, and generic test-time training, which suggests the gains are not tied to a single model or dataset.

Future development should explore whether evidence-aligned adaptation transfers to other long-context tasks beyond question answering, and whether the framework scales efficiently with model size. The approach may also inspire similar hybrid methods in other domains requiring selective attention over noisy or extensive information.

Key Takeaways
  • EASE-TTT combines evidence retrieval with test-time training to improve accuracy in long-context QA for smaller language models
  • The framework converts evidence chunks into soft attention supervision targets that guide model adaptation during inference
  • Results across six LongBench tasks show stronger performance than full-context inference, retrieval-only baselines, and generic test-time training methods
  • The approach processes the full original context rather than truncated retrieved chunks, preserving contextual information
  • Evidence-aligned adaptation addresses the semantic grounding gap in previous generic test-time training methods