y0news

#token-efficiency News & Analysis

12 articles tagged with #token-efficiency. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 3d ago · 7/10

Escaping the Context Bottleneck: Active Context Curation for LLM Agents via Reinforcement Learning

Researchers introduce ContextCurator, a reinforcement learning-based framework that decouples context management from task execution in LLM agents, addressing the context bottleneck problem. The approach pairs a lightweight specialized policy model with a frozen foundation model, achieving significant improvements in success rates and token efficiency across benchmark tasks.

🧠 GPT-4 · 🧠 Gemini
AI · Bullish · arXiv – CS AI · Apr 10 · 7/10

Do We Need Distinct Representations for Every Speech Token? Unveiling and Exploiting Redundancy in Large Speech Language Models

Researchers demonstrate that large speech language models contain significant redundancy in their token representations, particularly in deeper layers. By introducing Affinity Pooling, a training-free token-merging technique, they achieve a 27.48% reduction in prefilling FLOPs and up to 1.7× memory savings while maintaining semantic accuracy, challenging the necessity of fully distinct tokens for acoustic processing.
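The core idea behind training-free token merging can be illustrated with a minimal sketch: greedily mean-pool adjacent token vectors whose cosine similarity exceeds a threshold. The function below is an illustration of the concept only, not the paper's Affinity Pooling algorithm.

```python
import numpy as np

def affinity_pool(tokens: np.ndarray, threshold: float = 0.9) -> np.ndarray:
    """Greedily merge adjacent token vectors whose cosine similarity
    exceeds `threshold`, replacing each merged run with a running mean.
    Illustrative sketch only -- not the paper's exact algorithm."""
    pooled = [tokens[0]]
    for vec in tokens[1:]:
        prev = pooled[-1]
        cos = float(vec @ prev / (np.linalg.norm(vec) * np.linalg.norm(prev) + 1e-9))
        if cos > threshold:
            pooled[-1] = (prev + vec) / 2  # fold near-duplicate into its neighbor
        else:
            pooled.append(vec)             # distinct token: keep it
    return np.stack(pooled)
```

Runs of highly similar neighbors collapse to a single vector, which is where the FLOP and memory savings come from.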

AI · Bullish · arXiv – CS AI · Apr 6 · 7/10

JoyAI-LLM Flash: Advancing Mid-Scale LLMs with Token Efficiency

JoyAI-LLM Flash is a new efficient Mixture-of-Experts language model with 48B total parameters that activates only 2.7B per forward pass, trained on 20 trillion tokens. The model introduces FiberPO, a novel reinforcement learning algorithm, achieves higher sparsity ratios than comparable industry models, and is released open-source on Hugging Face.
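Activating 2.7B of 48B parameters per forward pass is the standard sparse-MoE pattern: a gate scores all experts, but only the top-k run. The sketch below shows that routing in miniature; the gating scheme and names are generic MoE assumptions, not JoyAI-LLM Flash's actual architecture.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Sparse MoE layer: route input x to the top-k experts by gate
    score and mix their outputs by softmax weight. Only k of the
    experts execute, so most parameters stay inactive per pass."""
    logits = x @ gate_w                      # one gate score per expert
    top = np.argsort(logits)[-k:]            # indices of the k best experts
    w = np.exp(logits[top] - logits[top].max())
    w /= w.sum()                             # softmax over the selected k only
    return sum(wi * experts[i](x) for wi, i in zip(w, top))
```

With 4 experts and k=2, half the expert parameters never run; JoyAI-LLM Flash's 2.7B/48B ratio is the same idea at a much higher sparsity.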

๐Ÿข Hugging Face
AIBullisharXiv โ€“ CS AI ยท Mar 47/104
๐Ÿง 

Adaptive Social Learning via Mode Policy Optimization for Language Agents

Researchers propose an Adaptive Social Learning (ASL) framework with Adaptive Mode Policy Optimization (AMPO) algorithm to improve language agents' reasoning abilities in social interactions. The system dynamically adjusts reasoning depth based on context, achieving 15.6% higher performance than GPT-4o while using 32.8% shorter reasoning chains.

AI · Bullish · arXiv – CS AI · 2d ago · 6/10

Heuristic Classification of Thoughts Prompting (HCoT): Integrating Expert System Heuristics for Structured Reasoning into Large Language Models

Researchers propose Heuristic Classification of Thoughts (HCoT), a novel prompting method that integrates expert system heuristics into large language models to improve structured reasoning on complex problems. The approach addresses LLMs' stochastic token generation and decoupled reasoning mechanisms by using heuristic classification to guide and optimize decision trajectories, and it demonstrates superior performance and token efficiency compared to existing methods such as Chain-of-Thought and Tree-of-Thoughts prompting.
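Heuristic classification, in the expert-system sense, means matching a problem against hand-written rules to pick a solution template before reasoning starts. The toy router below conveys that flavor; the rules and templates are invented for illustration and are not HCoT's actual heuristics.

```python
# Toy heuristic classifier that routes a question to a reasoning
# template before the LLM sees it. Rules/templates are illustrative only.
RULES = [
    ("arithmetic", ("sum", "total", "how many")),
    ("logic",      ("implies", "therefore", "premise")),
    ("planning",   ("schedule", "steps", "order")),
]
TEMPLATES = {
    "arithmetic": "Extract the quantities, then compute step by step.",
    "logic":      "List the premises, then apply inference rules one at a time.",
    "planning":   "Enumerate subgoals, then order them by dependency.",
    None:         "Think step by step.",
}

def heuristic_prompt(question: str) -> str:
    """Classify the question by keyword cues, then prepend the matching
    reasoning template (falling back to a generic one)."""
    q = question.lower()
    category = next((cat for cat, cues in RULES
                     if any(cue in q for cue in cues)), None)
    return f"{TEMPLATES[category]}\nQuestion: {question}"
```

Because the template constrains the decision trajectory up front, the model spends fewer tokens exploring dead-end reasoning branches, which is the efficiency claim HCoT makes against unguided Chain-of-Thought.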

AI · Bullish · arXiv – CS AI · 4d ago · 6/10

Enhancing LLM Problem Solving via Tutor-Student Multi-Agent Interaction

Researchers present PETITE, a tutor-student multi-agent framework that enhances LLM problem-solving by assigning complementary roles to agents from the same model. Evaluated on coding benchmarks, the approach achieves comparable or superior accuracy to existing methods while consuming significantly fewer tokens, demonstrating that structured role-differentiated interactions can improve LLM performance more efficiently than larger models or heterogeneous ensembles.

AI · Bullish · arXiv – CS AI · Mar 11 · 6/10

LDP: An Identity-Aware Protocol for Multi-Agent LLM Systems

Researchers present LLM Delegate Protocol (LDP), a new AI-native communication protocol for multi-agent LLM systems that introduces identity awareness, progressive payloads, and governance mechanisms. The protocol achieves 12x lower latency on simple tasks and 37% token reduction compared to existing protocols like A2A, though quality improvements remain limited in small delegate pools.
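"Progressive payloads" suggests a two-stage message: a cheap summary always travels, and the full body is fetched only on demand. The dataclass below sketches that flow together with sender identity; the field names and API are assumptions in the spirit of the summary, not the LDP specification.

```python
from dataclasses import dataclass, field

# Sketch of identity-aware, progressive messaging; not the LDP spec.
@dataclass
class ProgressiveMessage:
    sender_id: str                  # identity awareness: every message is attributed
    summary: str                    # cheap first-stage payload, always sent
    _body: str = field(repr=False, default="")

    def expand(self) -> str:
        """Second stage: return the full body only when the recipient asks."""
        return self._body

def tokens(text: str) -> int:
    """Crude whitespace token count, enough to compare payload sizes."""
    return len(text.split())
```

A recipient that can act on the summary alone never pays for the body's tokens, which is one plausible source of the reported 37% token reduction.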

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

MemPO: Self-Memory Policy Optimization for Long-Horizon Agents

Researchers propose MemPO (Self-Memory Policy Optimization), a new algorithm that enables AI agents to autonomously manage their memory during long-horizon tasks. The method delivers F1 gains of 25.98% over base models while reducing token usage by 67.58%.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

Semantic XPath: Structured Agentic Memory Access for Conversational AI

Researchers have developed Semantic XPath, a tree-structured memory system for conversational AI that improves performance by 176.7% over traditional methods while using only 9.1% of the tokens. The system addresses scalability issues in long-term AI conversations by efficiently accessing and updating structured memory instead of appending growing conversation history.
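The efficiency argument is that reading one addressed subtree is far cheaper than replaying an ever-growing transcript. A minimal tree memory with path-style access conveys the idea; the class and its API are invented for illustration, not the paper's system.

```python
# Minimal tree-structured memory with path-style access, illustrating
# the idea behind Semantic XPath; the API below is invented.
class TreeMemory:
    def __init__(self):
        self.root = {}

    def write(self, path: str, value):
        """Store `value` at a slash-separated path, creating branches."""
        node = self.root
        *branches, leaf = path.strip("/").split("/")
        for key in branches:
            node = node.setdefault(key, {})
        node[leaf] = value

    def read(self, path: str):
        """Fetch only the requested subtree instead of replaying history."""
        node = self.root
        for key in path.strip("/").split("/"):
            node = node[key]
        return node
```

A prompt then includes only `read("/user/prefs")` rather than the whole conversation, which is how a 9.1% token footprint becomes plausible.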

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

Stop Wasting Your Tokens: Towards Efficient Runtime Multi-Agent Systems

Researchers introduce SupervisorAgent, a lightweight framework that reduces token consumption in Multi-Agent Systems by 29.68% while maintaining performance. The system provides real-time supervision and error correction without modifying base agent architectures, validated across multiple AI benchmarks.
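Runtime supervision without modifying the base agents can be as simple as wrapping each agent's step function and intervening on wasteful patterns. The wrapper below halts an agent that loops on identical output; the policy is an illustrative assumption, not SupervisorAgent's actual design.

```python
# Sketch of lightweight runtime supervision: wrap an agent's step
# function and cut off repeated identical outputs before more tokens
# are spent. Policy and names are illustrative only.
def supervise(agent_step, max_repeats: int = 2):
    history = []

    def wrapped(observation):
        out = agent_step(observation)
        history.append(out)
        if history[-max_repeats:].count(out) >= max_repeats:
            return ("halt", out)       # supervisor intervenes on a loop
        return ("continue", out)

    return wrapped
```

Because the base agent is untouched, the same wrapper drops onto any architecture, matching the summary's "without modifying base agent architectures" claim.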

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

Token-Oriented Object Notation vs JSON: A Benchmark of Plain and Constrained Decoding Generation

A benchmark study compares Token-Oriented Object Notation (TOON) with JSON for structured data serialization in LLMs, finding that while TOON reduces token usage, plain JSON shows better accuracy overall. The research reveals that TOON's efficiency benefits may only emerge at scale where syntax savings offset the initial prompt overhead.
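TOON's token savings come from stating keys once as a header and then emitting bare rows, where JSON repeats every key per object. The simplified serializer below shows that trade-off on a uniform list of records; it is a sketch of the idea, not a conforming TOON implementation.

```python
import json

def toon_like(name, rows):
    """Serialize a uniform list of dicts in simplified TOON-style tabular
    form: header once (`name[count]{keys}:`), then one comma-separated
    line per row. Sketch only, not a conforming TOON serializer."""
    keys = list(rows[0])
    header = f"{name}[{len(rows)}]{{{','.join(keys)}}}:"
    body = "\n".join("  " + ",".join(str(r[k]) for k in keys) for r in rows)
    return header + "\n" + body

rows = [{"id": i, "name": f"user{i}"} for i in range(20)]
compact_json = json.dumps({"users": rows}, separators=(",", ":"))
```

With 20 rows the tabular form is far shorter than compact JSON, but with one or two rows the header overhead dominates, which is consistent with the benchmark's finding that TOON's benefits may only emerge at scale.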