12 articles tagged with #token-efficiency. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Bullish · arXiv – CS AI · 3d ago · 7/10
🧠 Researchers introduce ContextCurator, a reinforcement learning-based framework that decouples context management from task execution in LLM agents, addressing the context bottleneck problem. The approach pairs a lightweight specialized policy model with a frozen foundation model, achieving significant improvements in success rates and token efficiency across benchmark tasks.
🧠 GPT-4 · 🧠 Gemini
AI · Bullish · arXiv – CS AI · Apr 10 · 7/10
🧠 Researchers demonstrate that large speech language models contain significant redundancy in their token representations, particularly in deeper layers. By introducing Affinity Pooling, a training-free token merging technique, they achieve a 27.48% reduction in prefilling FLOPs and up to 1.7× memory savings while maintaining semantic accuracy, challenging the necessity of fully distinct tokens for acoustic processing.
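The paper's exact Affinity Pooling procedure isn't reproduced in the summary, but the core idea — training-free merging of highly similar adjacent token representations to shorten the sequence before prefill — can be sketched as follows. The function name, the greedy pairwise strategy, and the 0.9 threshold are all illustrative assumptions, not the paper's method:

```python
import numpy as np

def merge_similar_tokens(hidden, threshold=0.9):
    """Greedily average-merge adjacent token vectors whose cosine
    similarity exceeds `threshold`, shrinking the sequence length."""
    merged = [hidden[0]]
    for vec in hidden[1:]:
        prev = merged[-1]
        cos = np.dot(prev, vec) / (np.linalg.norm(prev) * np.linalg.norm(vec) + 1e-9)
        if cos > threshold:
            merged[-1] = (prev + vec) / 2.0  # pool the redundant pair
        else:
            merged.append(vec)
    return np.stack(merged)

# Simulate redundant acoustic tokens: each distinct vector repeated 3x.
base = np.eye(4, 8)                      # 4 mutually orthogonal vectors
redundant = np.repeat(base, 3, axis=0)   # shape (12, 8)
pooled = merge_similar_tokens(redundant)
print(redundant.shape[0], "->", pooled.shape[0])  # 12 -> 4
```

Since attention cost scales with sequence length, any reduction here translates directly into prefill FLOPs and KV-cache savings.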
AI · Bullish · arXiv – CS AI · Apr 6 · 7/10
🧠 JoyAI-LLM Flash is a new efficient Mixture-of-Experts language model with 48B parameters that activates only 2.7B per forward pass, trained on 20 trillion tokens. The model introduces FiberPO, a novel reinforcement learning algorithm, and achieves higher sparsity ratios than comparable industry models while being released open-source on Hugging Face.
🟢 Hugging Face
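The "activates only 2.7B of 48B parameters" claim is the standard sparse Mixture-of-Experts pattern: a gate scores all experts but only the top-k actually run. The sketch below is a generic top-k MoE layer for illustration, not JoyAI-LLM Flash's actual router or the FiberPO algorithm:

```python
import numpy as np

def moe_forward(x, experts, gate_w, k=2):
    """Route x to the top-k experts by gate score and mix their
    outputs; only k of len(experts) experts execute."""
    scores = x @ gate_w                  # one score per expert
    topk = np.argsort(scores)[-k:]       # indices of the active experts
    weights = np.exp(scores[topk])
    weights /= weights.sum()             # softmax over the active set only
    y = sum(w * experts[i](x) for w, i in zip(weights, topk))
    return y, topk

rng = np.random.default_rng(1)
d, n_experts = 16, 8
# Each "expert" is just a linear map here.
experts = [lambda x, W=rng.normal(size=(d, d)): x @ W for _ in range(n_experts)]
gate_w = rng.normal(size=(d, n_experts))
x = rng.normal(size=d)
y, active = moe_forward(x, experts, gate_w, k=2)
print(f"active experts: {sorted(active.tolist())} of {n_experts}")
```

With k=2 of 8 experts firing, per-token compute scales with the active fraction rather than the total parameter count — the same economics behind 2.7B active of 48B total.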
AI · Bullish · arXiv – CS AI · Mar 4 · 7/10
🧠 Researchers propose an Adaptive Social Learning (ASL) framework with an Adaptive Mode Policy Optimization (AMPO) algorithm to improve language agents' reasoning abilities in social interactions. The system dynamically adjusts reasoning depth based on context, achieving 15.6% higher performance than GPT-4o while using 32.8% shorter reasoning chains.
AI · Bullish · arXiv – CS AI · 2d ago · 6/10
🧠 Researchers propose Heuristic Classification of Thoughts (HCoT), a novel prompting method that integrates expert system heuristics into large language models to improve structured reasoning on complex problems. The approach addresses LLMs' stochastic token generation and decoupled reasoning mechanisms by using heuristic classification to guide and optimize decision trajectories, demonstrating superior performance and token efficiency compared to existing methods like Chain-of-Thought and Tree-of-Thoughts prompting.
AI · Bullish · arXiv – CS AI · 4d ago · 6/10
🧠 Researchers present PETITE, a tutor-student multi-agent framework that enhances LLM problem-solving by assigning complementary roles to agents from the same model. Evaluated on coding benchmarks, the approach achieves comparable or superior accuracy to existing methods while consuming significantly fewer tokens, demonstrating that structured role-differentiated interactions can improve LLM performance more efficiently than larger models or heterogeneous ensembles.
AI · Bullish · arXiv – CS AI · Mar 26 · 6/10
🧠 SafeSieve is a new algorithm for optimizing LLM-based multi-agent systems that reduces token usage by 12.4%–27.8% while maintaining 94.01% accuracy. The progressive pruning method combines semantic evaluation with performance feedback to eliminate redundant communication between AI agents.
AI · Bullish · arXiv – CS AI · Mar 11 · 6/10
🧠 Researchers present LLM Delegate Protocol (LDP), a new AI-native communication protocol for multi-agent LLM systems that introduces identity awareness, progressive payloads, and governance mechanisms. The protocol achieves 12× lower latency on simple tasks and 37% token reduction compared to existing protocols like A2A, though quality improvements remain limited in small delegate pools.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠 Researchers propose MemPO (Self-Memory Policy Optimization), a new algorithm that enables AI agents to autonomously manage their memory during long-horizon tasks. The method achieves significant performance improvements with 25.98% F1 score gains over base models while reducing token usage by 67.58%.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠 Researchers have developed Semantic XPath, a tree-structured memory system for conversational AI that improves performance by 176.7% over traditional methods while using only 9.1% of the tokens. The system addresses scalability issues in long-term AI conversations by efficiently accessing and updating structured memory instead of appending growing conversation history.
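The token savings come from the access pattern: instead of replaying an ever-growing transcript, the agent reads and overwrites facts at addressed locations in a tree. A toy version of that pattern (slash-separated paths standing in for the paper's XPath-style addressing, which is not specified in the summary) looks like:

```python
class TreeMemory:
    """Toy tree-structured memory: facts live at slash-separated
    paths and are read/updated in place instead of re-sending a
    full conversation history each turn."""

    def __init__(self):
        self.root = {}

    def set(self, path, value):
        node = self.root
        *parents, leaf = path.strip("/").split("/")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value

    def get(self, path):
        node = self.root
        for part in path.strip("/").split("/"):
            node = node[part]
        return node

mem = TreeMemory()
mem.set("user/preferences/language", "French")
mem.set("user/preferences/language", "German")  # in-place update, not append
print(mem.get("user/preferences/language"))      # German
```

An append-only history would carry both the stale and the updated fact into every future prompt; the tree keeps only the current value, which is where the order-of-magnitude token reduction comes from.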
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10
🧠 Researchers introduce SupervisorAgent, a lightweight framework that reduces token consumption in Multi-Agent Systems by 29.68% while maintaining performance. The system provides real-time supervision and error correction without modifying base agent architectures, validated across multiple AI benchmarks.
AI · Neutral · arXiv – CS AI · Mar 5 · 4/10
🧠 A benchmark study compares Token-Oriented Object Notation (TOON) with JSON for structured data serialization in LLMs, finding that while TOON reduces token usage, plain JSON shows better accuracy overall. The research reveals that TOON's efficiency benefits may only emerge at scale, where syntax savings offset the initial prompt overhead.
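The trade-off is easy to see with character counts as a rough proxy for tokens: JSON repeats every key in every row, while a TOON-style encoding declares the fields once and then emits bare values. The encoding below is a simplified illustration of that idea, not the full TOON specification:

```python
import json

rows = [{"id": i, "name": f"user{i}", "score": 10 * i} for i in range(3)]

# Plain JSON repeats every key in every row.
as_json = json.dumps(rows, separators=(",", ":"))

# Simplified TOON-style encoding: declare the fields once in a
# header, then emit comma-separated values per row.
fields = list(rows[0])
as_toon = f"[{len(rows)}]{{{','.join(fields)}}}:\n" + "\n".join(
    ",".join(str(r[f]) for f in fields) for r in rows
)

print(len(as_json), "chars as JSON vs", len(as_toon), "chars TOON-style")
```

The header is a fixed, per-payload cost, which matches the study's finding: with only a few rows the declaration overhead eats the savings, and the efficiency gap only widens as the row count grows.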