#context-compression News & Analysis

9 articles tagged with #context-compression. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bearish · arXiv – CS AI · Jun 23 · 7/10

Governance Decay: How Context Compaction Silently Erases Safety Constraints in Long-Horizon LLM Agents

Researchers discover that LLM agents lose safety compliance when governance constraints are compressed or summarized during long sessions, with violations rising from 0% to 59% after context compaction. The study introduces a benchmark demonstrating this 'Governance Decay' failure mode and proposes Constraint Pinning as a training-free mitigation.
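
The summary does not say how Constraint Pinning is implemented. As a minimal sketch of one plausible reading, assume that pinning means governance messages are exempt from summarization and carried over verbatim at each compaction. The `Message` type, the `pinned` flag, `compact`, and the truncating `summarize` stand-in are all hypothetical and not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str
    pinned: bool = False  # hypothetical flag marking a governance constraint


def summarize(messages):
    # Stand-in for an LLM summarizer: keeps the first 20 characters of each message.
    return " | ".join(m.text[:20] for m in messages)


def compact(history, keep_last=2):
    """Compact a history, carrying pinned messages over verbatim."""
    pinned = [m for m in history if m.pinned]
    unpinned = [m for m in history if not m.pinned]
    if len(unpinned) <= keep_last:
        return list(history)  # nothing old enough to summarize
    split = len(unpinned) - keep_last
    old, recent = unpinned[:split], unpinned[split:]
    summary = Message("system", "Summary: " + summarize(old))
    # Pinned constraints go first and are never passed to the summarizer.
    return pinned + [summary] + recent


history = [
    Message("system", "Never delete files outside /tmp.", pinned=True),
    Message("user", "a" * 50),
    Message("assistant", "b" * 50),
    Message("user", "c"),
    Message("assistant", "d"),
]
compacted = compact(history)
```

In this sketch the constraint text survives compaction unchanged, while the older unpinned turns are collapsed into one lossy summary message.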

AI · Bullish · arXiv – CS AI · Jun 5 · 7/10

ABBEL: Learning Natural-Language Belief States for Memory-Efficient Interaction

ABBEL is a new recursive summarization framework that enables AI agents to maintain memory-efficient interaction histories by storing information as natural-language belief states rather than full context. The approach uses reinforcement learning techniques to improve belief generation quality, achieving 40% better performance than prior memory-constrained agents while using 67% less memory.
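
The recursive pattern described here can be sketched as a loop that carries forward only a bounded belief string instead of the full history. This is an illustration under stated assumptions, not ABBEL itself: `update_belief` is a hypothetical stand-in for a model call (here it merely concatenates and truncates), and the reinforcement learning step is out of scope.

```python
def update_belief(belief, observation, max_chars=200):
    """Hypothetical stand-in for a model call that rewrites the belief state."""
    merged = (belief + " " + observation).strip()
    # Keep only the most recent characters so the state stays bounded.
    return merged[-max_chars:]


def run_episode(observations, max_chars=200):
    """Process observations one by one, storing only the current belief."""
    belief = ""
    for obs in observations:
        belief = update_belief(belief, obs, max_chars)
    return belief


final_belief = run_episode(["door is locked", "key is under the mat"])
```

The memory held between steps is capped at `max_chars` regardless of how many observations arrive, which is the property the summary attributes to belief-state storage.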

AI · Bullish · arXiv – CS AI · Jun 2 · 7/10

ACON: Optimizing Context Compression for Long-horizon LLM Agents

Researchers introduce ACON, a framework that compresses long-context information for LLM agents without model fine-tuning, reducing token usage by 26-54% while improving task success rates. The method optimizes compression through natural language refinement and enables smaller language models to function effectively as long-horizon agents.

AI · Bullish · arXiv – CS AI · May 29 · 7/10

Less Is More: Elevating RAG via Performance-Driven Context Compression

Researchers introduce CORE-RAG, a framework that compresses context in Retrieval-Augmented Generation systems using performance-driven learning rather than predefined heuristics. The approach achieves a 97% compression ratio while improving exact-match accuracy by 3.3 points, addressing a critical bottleneck in LLM efficiency.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

ZipRL: Adaptive Multi-Turn Context Compression with Hindsight Response Replay

Researchers introduce ZipRL, an adaptive context compression framework that uses reinforcement learning to efficiently reduce token usage in multi-turn LLM agent tasks while preserving task-critical information. The method incorporates Hindsight Response Replay to address sparse reward problems and demonstrates 27-35% performance improvements over existing approaches on benchmark tasks.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

Thinking as Compression: Your Reasoning Model is Secretly a Context Compressor

Researchers introduce Thinking as Compression (TaC), a novel approach that leverages language model reasoning traces as a natural context compression mechanism without requiring dedicated compression modules. The method demonstrates significant performance gains, outperforming existing compression baselines by 17-23% across long-context QA benchmarks at high compression ratios.

AI · Bullish · arXiv – CS AI · May 27 · 7/10

Tool-Schema Compression Enables Agentic RAG Under Constrained Context Budgets

Researchers demonstrate that tool-schema compression reduces token consumption by 44-50%, enabling large language model agents to function under tight context constraints. Testing across 14 models shows compressed schemas restore RAG functionality with +20.5 percentage point exact-match improvements at 8K tokens, while frontier models can now handle 800+ tools instead of ~494.

AI · Bullish · arXiv – CS AI · Apr 14 · 7/10

MEMENTO: Teaching LLMs to Manage Their Own Context

Researchers introduce MEMENTO, a method enabling large language models to compress their reasoning into dense summaries (mementos) organized into blocks, reducing KV cache usage by 2.5x and improving throughput by 1.75x while maintaining accuracy. The technique is validated across multiple model families using OpenMementos, a new dataset of 228K annotated reasoning traces.

AI · Bullish · arXiv – CS AI · Jun 3 · 6/10

Perceive Before Reasoning: A Pre-Reasoning Perception Framework for Efficient and Reliable Proactive Mobile Agents

Researchers propose the Pre-Reasoning Perception Framework (PRPF), a two-stage system that improves mobile agent efficiency by separating intervention detection from task reasoning. The framework uses a lightweight perceptor to decide when assistance is needed before activating a larger reasoning model, reducing false triggers and computational overhead.
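
The two-stage gating described above can be sketched as a cheap check that runs on every event and a costly call that runs only when the check fires. Both `perceptor` and `reasoner` are hypothetical stand-ins (a keyword test and a string template), not the components used in PRPF.

```python
def perceptor(event):
    """Hypothetical lightweight check: does this event need assistance?"""
    return "error" in event.lower()


def reasoner(event):
    """Hypothetical stand-in for the larger, costlier reasoning model."""
    return f"suggested action for: {event}"


def handle(events):
    """Run the perceptor on every event; call the reasoner only when it fires."""
    responses = []
    reasoner_calls = 0
    for event in events:
        if perceptor(event):
            reasoner_calls += 1
            responses.append(reasoner(event))
    return responses, reasoner_calls


responses, reasoner_calls = handle(["screen on", "Error: payment failed", "scroll"])
```

Here the expensive stage is invoked for one of three events, which illustrates how the gate reduces computational overhead when most events need no intervention.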