y0news

#token-compression News & Analysis

8 articles tagged with #token-compression. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Apr 7 · 7/10

LightThinker++: From Reasoning Compression to Memory Management

Researchers developed LightThinker++, a new framework that enables large language models to compress intermediate reasoning thoughts and manage memory more efficiently. The system reduces peak token usage by up to 70% while improving accuracy by 2.42% and maintaining performance over extended reasoning tasks.
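The compress-then-continue idea behind this summary can be sketched as a toy loop: when accumulated reasoning would exceed a token budget, older thoughts are squeezed down to a gist or dropped, which caps peak token usage. Everything below (the whitespace tokenizer, the last-sentence "compressor", the budget of 15) is an illustrative stand-in, not LightThinker++'s learned mechanism.

```python
# Toy sketch of a compress-then-continue reasoning loop.
# The real system learns when and how to compress; here the trigger
# is a fixed token budget and "compression" keeps only the last
# sentence of the oldest thought.

def count_tokens(text: str) -> int:
    # Crude whitespace tokenizer as a stand-in for a real one.
    return len(text.split())

def compress(thought: str) -> str:
    # Stand-in compressor: keep the final sentence as a "gist".
    sentences = [s.strip() for s in thought.split(".") if s.strip()]
    return (sentences[-1] + ".") if sentences else ""

def reason(steps, budget=15):
    """Accumulate reasoning steps, compressing (then dropping) the
    oldest ones whenever the running context exceeds `budget` tokens.
    Returns the final context and the peak token count observed."""
    context, peak = [], 0
    for step in steps:
        context.append(step)
        while count_tokens(" ".join(context)) > budget and len(context) > 1:
            context[0] = compress(context[0])
            if count_tokens(" ".join(context)) > budget:
                context.pop(0)  # gist alone is still too large: drop it
        peak = max(peak, count_tokens(" ".join(context)))
    return context, peak
```

A single step larger than the budget is kept as-is (the loop never empties the context), which is where a learned compressor rather than sentence truncation would matter.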

AI · Bullish · arXiv – CS AI · Apr 6 · 6/10

Token-Efficient Multimodal Reasoning via Image Prompt Packaging

Researchers introduce Image Prompt Packaging (IPPg), a technique that embeds text directly into images to reduce multimodal AI inference costs by 35.8-91.0% while maintaining competitive accuracy. The method shows significant promise for cost optimization in large multimodal language models, though effectiveness varies by model and task type.

🧠 GPT-4 · 🧠 Claude
AI · Bullish · arXiv – CS AI · Mar 27 · 6/10

Photon: Speedup Volume Understanding with Efficient Multimodal Large Language Models

Photon is a new framework that efficiently processes 3D medical imaging for AI visual question answering by using variable-length token sequences and adaptive compression. The system reduces computational costs while maintaining accuracy through instruction-conditioned token scheduling and custom gradient propagation techniques.

AI · Bullish · arXiv – CS AI · Mar 16 · 6/10

Structured Distillation for Personalized Agent Memory: 11x Token Reduction with Retrieval Preservation

Researchers developed a structured distillation method that compresses AI agent conversation history by 11x (from 371 to 38 tokens per exchange) while maintaining 96% of retrieval quality. The technique enables thousands of exchanges to fit within a single prompt at 1/11th the context cost, addressing the expensive verbatim storage problem for long AI conversations.
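The quoted figures are easy to sanity-check: at 38 tokens per compressed exchange versus 371 verbatim, far more conversation history fits in a fixed context. The 128k-token window below is an assumed size for illustration, not a figure from the article.

```python
# Back-of-envelope check of the per-exchange figures quoted above.

VERBATIM_TOKENS = 371      # tokens per exchange, as reported
DISTILLED_TOKENS = 38      # tokens per exchange after distillation
CONTEXT_WINDOW = 128_000   # assumed window size for illustration

verbatim_fit = CONTEXT_WINDOW // VERBATIM_TOKENS    # 345 exchanges
distilled_fit = CONTEXT_WINDOW // DISTILLED_TOKENS  # 3368 exchanges
```

Hundreds of verbatim exchanges become thousands of distilled ones in the same window, consistent with the "thousands of exchanges in a single prompt" claim.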

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

TC-SSA: Token Compression via Semantic Slot Aggregation for Gigapixel Pathology Reasoning

Researchers propose TC-SSA, a token compression framework that enables large vision-language models to process gigapixel pathology images by reducing visual tokens to 1.7% of original size while maintaining diagnostic accuracy. The method achieves 78.34% overall accuracy on SlideBench and demonstrates strong performance across multiple cancer classification tasks.
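The summary does not detail TC-SSA's mechanism, but the general idea of semantic slot aggregation can be sketched generically: many token vectors are assigned to a small number of slots and pooled, so K slot vectors replace N tokens. The hand-picked slot "queries" below stand in for what would be learned parameters in a real system.

```python
# Generic slot-aggregation sketch: pool many token vectors into a few
# semantic slots by nearest-slot assignment and per-slot averaging.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def aggregate_into_slots(tokens, slots):
    """Assign each token vector to its most similar slot (by dot
    product) and return the mean vector per slot, so len(slots)
    outputs replace len(tokens) inputs."""
    buckets = [[] for _ in slots]
    for t in tokens:
        best = max(range(len(slots)), key=lambda i: dot(t, slots[i]))
        buckets[best].append(t)
    pooled = []
    for slot, bucket in zip(slots, buckets):
        if not bucket:                 # empty slot: fall back to its query
            pooled.append(list(slot))
            continue
        dim = len(bucket[0])
        pooled.append([sum(v[d] for v in bucket) / len(bucket)
                       for d in range(dim)])
    return pooled
```

With 2 slots over 4 tokens this is a 50% reduction; at the paper's reported 1.7% ratio, thousands of patch tokens from a gigapixel slide would collapse into a few dozen slots.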

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

Contribution-aware Token Compression for Efficient Video Understanding via Reinforcement Learning

Researchers developed CaCoVID, a reinforcement learning-based algorithm that compresses video tokens for large language models by selecting tokens based on their actual contribution to correct predictions rather than attention scores. The method uses combinatorial policy optimization to reduce computational overhead while maintaining video understanding performance.
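The core distinction here, selecting tokens by their contribution to the prediction rather than by attention weight, can be illustrated with a leave-one-out toy. The stub scorer and token values below are made up, and this sketch does not attempt CaCoVID's reinforcement-learning policy optimization; it only shows what "contribution-based" means.

```python
# Toy contribution-based token selection: rank tokens by how much the
# score drops when each one is removed, then keep the top-k.

def model_score(tokens, selected):
    # Stub scorer standing in for a model: only positive-valued
    # tokens actually help the "prediction".
    return sum(tokens[i] for i in selected if tokens[i] > 0)

def contribution_select(tokens, k):
    """Leave-one-out contribution ranking: contribution of token i is
    the score with all tokens minus the score without token i."""
    full = set(range(len(tokens)))
    base = model_score(tokens, full)
    contrib = {i: base - model_score(tokens, full - {i}) for i in full}
    ranked = sorted(full, key=lambda i: contrib[i], reverse=True)
    return sorted(ranked[:k])
```

An attention-based selector might still rank a high-magnitude but unhelpful token (like index 1 below) highly; the contribution criterion assigns it zero and drops it.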

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10

EfficientPosterGen: Semantic-aware Efficient Poster Generation via Token Compression and Accurate Violation Detection

Researchers introduce EfficientPosterGen, an AI framework that automatically converts research papers into academic posters using semantic-aware retrieval and token compression. The system addresses limitations of existing multimodal language models, reducing token consumption while maintaining poster quality through visual context compression and deterministic layout violation detection.
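"Deterministic layout violation detection" at its simplest is plain geometry: flag poster elements that overlap each other or spill off the canvas. The element names and coordinates below are hypothetical; the paper's actual checks are not described in this summary.

```python
# Minimal deterministic layout checker: boxes are (x0, y0, x1, y1).

def overlaps(a, b):
    # Touching edges do not count as an overlap.
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def layout_violations(boxes, width, height):
    """Return human-readable violations: out-of-bounds elements,
    then overlapping pairs (names checked in sorted order)."""
    issues = []
    names = sorted(boxes)
    for name in names:
        x0, y0, x1, y1 = boxes[name]
        if x0 < 0 or y0 < 0 or x1 > width or y1 > height:
            issues.append(f"{name}: out of bounds")
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if overlaps(boxes[a], boxes[b]):
                issues.append(f"{a}/{b}: overlap")
    return issues
```

Because the check is pure geometry, it is deterministic and cheap, unlike asking a multimodal model to judge the rendered layout.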