AI · Bullish · arXiv – CS AI · May 12 · 7/10
🧠Researchers introduce DUET, a method for optimizing token allocation in reinforcement learning with verifiable rewards that jointly controls which prompts receive rollouts and how long each rollout runs. The technique achieves superior reasoning quality on math and coding benchmarks while using 50% fewer tokens than baseline methods, suggesting efficiency gains don't require sacrificing model performance.
🧠 Llama
AI · Bullish · arXiv – CS AI · May 4 · 7/10
🧠Researchers introduce A11y-Compressor, a framework that optimizes how AI agents interpret graphical user interfaces by transforming accessibility trees into more efficient representations. The approach reduces input tokens by 78% while simultaneously improving task success rates by 5.1 percentage points, addressing a critical bottleneck in GUI automation systems.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠Researchers introduce SwiReasoning, a training-free framework that improves large language model reasoning by dynamically switching between explicit chain-of-thought and latent reasoning modes. The method achieves 1.8%-3.1% accuracy improvements and 57%-79% better token efficiency across mathematics, STEM, coding, and general benchmarks.
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠Researchers benchmark token-optimized data formats (TRON and TOON) against JSON in agentic AI systems, finding TRON reduces token consumption by up to 27% with acceptable accuracy trade-offs. The study reveals that while these alternatives show promise in isolated tasks, their real-world performance in multi-turn agent loops exposes limitations, particularly with TOON's parsing cascades and parallel tool-call handling.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers introduce RADAR, a framework that optimizes multi-agent LLM communication structures through adaptive diffusion models, reducing token consumption while improving task accuracy. The approach moves beyond fixed communication topologies to enable dynamic, task-specific agent coordination across diverse computational problems.
AI · Bullish · arXiv – CS AI · Apr 7 · 6/10
🧠ANX is a new protocol-first framework designed for AI agent interaction, featuring a 3EX decoupled architecture that reduces token consumption by up to 66% compared to existing methods. The open-source protocol addresses security and efficiency issues in current AI agent implementations through agent-native design and integrated CLI, Skill, and MCP components.
🧠 GPT-4
AI · Bullish · arXiv – CS AI · Apr 7 · 6/10
🧠Research reveals that multi-agent LLM committees suffer from "representational collapse": agents produce highly similar outputs despite different role prompts, with a mean cosine similarity of 0.888 between outputs. A new diversity-aware consensus protocol (DALC) reaches 87% accuracy while reducing token costs by 26% compared to traditional self-consistency methods.
AI · Bullish · arXiv – CS AI · Mar 3 · 6/10
🧠Researchers introduce One-Token Verification (OTV), a method that estimates reasoning correctness in large language models in a single forward pass, reducing computational overhead. Through early termination, OTV cuts token usage by up to 90% while improving accuracy on mathematical reasoning tasks compared to existing verification methods.
AI · Bullish · arXiv – CS AI · Mar 3 · 4/10
🧠Researchers propose I-LLMRec, a new method for AI recommender systems that uses images instead of lengthy text descriptions to represent items, reducing computational token usage while maintaining recommendation quality. The approach leverages the information overlap between images and descriptions to create more efficient and robust LLM-based recommendation systems.