Loong: A Human-Like Long Document Translation Agent with Observe-and-Act Adaptive Context Selection
Researchers introduce Loong, an AI agent designed to improve long document translation by selectively retrieving relevant context from a 3E memory module rather than processing all available information. The system uses reinforcement learning to optimize context selection and reports significant translation quality improvements across multiple language pairs, with gains of up to 13 points on standard evaluation metrics.
Loong addresses a fundamental constraint in large language model translation: too little context undermines coherence, while too much context introduces noise. Traditional approaches either truncate documents to fit context windows or process entire documents indiscriminately, and both strategies produce suboptimal results. The 3E memory architecture, which stores essences, exemplars, and entities, mirrors human cognitive approaches to long-form translation by maintaining structured historical information rather than raw text.
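To make the idea of structured history concrete, here is a minimal sketch of what a three-part memory could look like. It is not Loong's implementation: the class name `ThreeEMemory`, its methods, and the reading of essences as short summaries, exemplars as source/translation pairs, and entities as term mappings are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ThreeEMemory:
    """Hypothetical container for the three kinds of history the article names."""

    # Assumed meaning: short summaries of already-translated segments.
    essences: List[str] = field(default_factory=list)
    # Assumed meaning: (source, translation) pairs kept as style references.
    exemplars: List[Tuple[str, str]] = field(default_factory=list)
    # Assumed meaning: source term -> rendering chosen earlier in the document.
    entities: Dict[str, str] = field(default_factory=dict)

    def record(self, source: str, translation: str, summary: str,
               terms: Dict[str, str]) -> None:
        """Store one translated segment as structured history, not raw text."""
        self.essences.append(summary)
        self.exemplars.append((source, translation))
        for term, rendering in terms.items():
            # Keep the first rendering so later segments stay consistent.
            self.entities.setdefault(term, rendering)

    def lookup_entities(self, segment: str) -> Dict[str, str]:
        """Return only the stored terms that occur in the new segment."""
        return {t: r for t, r in self.entities.items() if t in segment}
```

The point of the sketch is the shape of the data: a new segment can be matched against a small set of term mappings and summaries instead of against every sentence translated so far.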
This work builds on growing research into agentic AI systems that reason through complex tasks step-by-step rather than relying on single-pass inference. The observe-and-act framework enables the model to iteratively assess which historical context becomes most relevant for each translation decision, fundamentally shifting from passive context window management to active reasoning-based selection. Reinforcement learning optimization from self-generated trajectories represents a practical approach to scaling training without requiring massive human annotation efforts.
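The observe-and-act loop described above can be illustrated with a toy sketch. In Loong the selection decisions come from a model trained with reinforcement learning; the word-overlap rule below is only a stand-in for that learned policy, and the names `overlap_policy`, `select_context`, `MIN_OVERLAP`, and `max_steps` are hypothetical.

```python
from typing import Callable, List, Optional

MIN_OVERLAP = 2  # assumed threshold: fewer shared words counts as noise


def overlap_policy(segment: str, selected: List[str],
                   memory: List[str]) -> Optional[str]:
    """Toy stand-in for a learned policy.

    Returns the not-yet-selected memory item sharing the most words with
    the segment, or None when nothing reaches MIN_OVERLAP (i.e. "stop").
    """
    words = set(segment.lower().split())
    best, best_score = None, 0
    for item in memory:
        if item in selected:
            continue
        score = len(words & set(item.lower().split()))
        if score >= MIN_OVERLAP and score > best_score:
            best, best_score = item, score
    return best


def select_context(segment: str, memory: List[str],
                   policy: Callable[[str, List[str], List[str]], Optional[str]] = overlap_policy,
                   max_steps: int = 3) -> List[str]:
    """Iteratively build a context set instead of passing all history."""
    selected: List[str] = []
    for _ in range(max_steps):
        choice = policy(segment, selected, memory)  # observe current state
        if choice is None:  # policy judges the context sufficient
            break
        selected.append(choice)  # act: pull one item into the context
    return selected
```

A short usage example: given a memory of three earlier sentences, only the two that share enough vocabulary with the new segment are selected, and the unrelated one is left out.

```python
memory = [
    "the dragon guards the gate",
    "rain fell on the harbor",
    "the gate opened at dawn",
]
context = select_context("The dragon left the gate", memory)
```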
For the translation and language technology sector, this represents meaningful progress toward domain-specific document handling without architectural changes to underlying models. Enterprises managing multilingual content—particularly in technical, legal, or financial domains where consistency and global coherence matter significantly—could benefit from improved translation quality without retraining or fine-tuning expensive foundation models.
The emphasis on generalization across domains and robustness against noise suggests practical deployment potential. Future development will likely focus on scaling to additional language pairs and on investigating whether similar adaptive context mechanisms improve other long-context tasks such as summarization, question answering, or code generation. The open-source release enables broader community validation and extension.
- Loong uses a 3E memory module to selectively retrieve relevant context rather than processing entire documents, improving translation quality by up to 13 points
- The system employs reinforcement learning to optimize context selection through observe-and-act reasoning trajectories generated during inference
- Empirical results show strong generalization across domains, multiple language pairs, and robustness against contextual noise in ultra-long documents
- The approach addresses the core constraint of limited context windows in LLMs without requiring architectural modifications to foundation models
- Open-source release enables broader adoption and extension for enterprise document translation and potentially other long-context reasoning tasks