AI · Bullish · arXiv – CS AI · 4d ago · 7/10
🧠Researchers introduce IntentKV, a learned KV cache pruning technique that optimizes memory usage for multi-turn LLM agents without modifying the base model. The method achieves 23-30% reductions in peak request tokens and up to 92.6% fewer KV reads under tight memory budgets, addressing a critical bottleneck in long-horizon agent inference.
AI · Bullish · arXiv – CS AI · 5d ago · 7/10
🧠FMplex is a new model-serving system that enables multiple downstream tasks to share a single foundation model backbone through virtualization, reducing memory waste and computational costs. The system achieves up to 80% latency reduction compared to traditional spatial partitioning approaches while enabling clusters to host 6x more tasks simultaneously.
🏢 Meta
AI · Bullish · arXiv – CS AI · Jun 5 · 7/10
🧠Vortex is a new system that simplifies the development and deployment of sparse attention algorithms for large language models, enabling researchers and AI agents to rapidly prototype and evaluate efficiency improvements. The platform demonstrates substantial real-world performance gains, with optimized algorithms achieving up to 3.46× higher throughput than full attention while maintaining accuracy, and successfully extending sparse attention to emerging model architectures.
🏢 Nvidia
AI · Bearish · arXiv – CS AI · Jun 4 · 7/10
🧠Researchers introduce MAMA, a framework measuring how network topology affects private information leakage in multi-agent LLM systems. The study demonstrates that denser connectivity and shorter distances between attackers and targets significantly increase memory leakage, with practical implications for securing distributed AI systems.
AI · Bullish · arXiv – CS AI · Jun 2 · 7/10
🧠Researchers propose the Intelligent Computing Architecture Model (ICAM), a six-layer framework that applies classical computer architecture principles to large language models and agentic AI systems. The paper maps recurring engineering challenges—cache reuse, context management, agent scheduling, and permission control—to traditional systems problems, introducing three design laws to optimize model-native computing efficiency and coordination.
🧠 Claude
AI · Neutral · arXiv – CS AI · Jun 1 · 7/10
🧠Researchers demonstrate that restructuring communication topology in multi-robot systems yields significantly larger performance improvements than scaling individual model sizes, with hierarchical interaction design improving performance by 47 points versus 9 points from doubling neural network capacity. This finding challenges the conventional focus on model scaling in AI systems and suggests interaction architecture may be equally or more critical for coordinated multi-agent performance.
AI · Bullish · arXiv – CS AI · May 29 · 7/10
🧠Researchers introduce SALE (Strategy Auctions for Workload Efficiency), a framework that coordinates multiple small language model agents through a bidding mechanism to match or exceed the performance of large models while reducing costs by 35% and cutting reliance on the largest agent by 52%. The approach demonstrates that smaller AI agents can be effectively scaled for complex tasks through intelligent task allocation rather than relying solely on larger models.
AI · Bullish · arXiv – CS AI · May 28 · 7/10
🧠Researchers present a systematic study of Attention-FFN Disaggregation (AFD), a technique that separates attention and expert layers across different GPU groups to optimize inference serving for Mixture-of-Experts language models. The framework demonstrates that AFD enables 4k tokens/s throughput on DeepSeek-V3.2 under strict latency constraints where traditional disaggregation approaches fail, providing design principles for scaling LLM infrastructure.
AI · Bullish · arXiv – CS AI · May 12 · 7/10
🧠SynerDiff is a new continuous batching system for diffusion model inference that addresses resource contention issues between UNet and VAE components. The system achieves 1.6× throughput improvement and up to 78.7% latency reduction through intra-level and inter-level optimization strategies, enabling faster AI-generated content services.
AI · Bearish · arXiv – CS AI · Apr 20 · 7/10
🧠Researchers document a case study where a user's custom LLM system designed for self-regulation inadvertently caused loss of agency within 48 hours due to architectural flaws in prompt isolation. The study identifies context contamination and metacognitive co-option as failure mechanisms and proposes physical rather than logical isolation as a solution, raising critical ethical questions about protective versus restrictive AI system design.
AI · Bullish · arXiv – CS AI · Mar 11 · 7/10
🧠MASEval introduces a new framework-agnostic evaluation library for multi-agent AI systems that treats entire systems rather than just models as the unit of analysis. Research across 3 benchmarks, models, and frameworks reveals that framework choice impacts performance as much as model selection, challenging current model-centric evaluation approaches.
AI · Neutral · Google Research Blog · Jan 28 · 7/10
🧠The article discusses the scientific principles behind scaling agent systems in generative AI, examining the conditions and factors that determine when agent systems perform effectively. It appears to focus on understanding the theoretical foundations for building and deploying AI agent systems at scale.
AI · Neutral · arXiv – CS AI · 4d ago · 5/10
🧠Researchers propose a Bayesian Network-based Decision Support System (DSS) to help infrastructure operators select appropriate security tools across heterogeneous open-source networks. The framework addresses the growing complexity of managing interconnected systems by automating the matching of high-level security requirements to suitable mechanisms.
General · Bullish · Fortune Crypto · 6d ago · 6/10
📰On America's 250th anniversary, the article argues that the nation's greatest competitive advantage has never been a physical product or resource, but rather a systemic framework that empowers individuals to innovate and build without requiring prior permission from authorities. This foundational principle of permissionless innovation has been central to American economic and technological leadership.
AI · Neutral · arXiv – CS AI · Jun 3 · 6/10
🧠Researchers introduce GAMBLe, a framework for analyzing AI-Driven Research Systems (ADRS) that couple large language models with automated evaluation. Through 760+ experiments, the framework reveals that standard convergence guarantees fail to capture ADRS behavior, and component selection can improve performance by 13-67% depending on the problem.
AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers propose a practical framework for building LLM-based agentic systems that prioritizes simplicity, cost predictability, and controllability over maximum optimization. The framework uses modular "pseudo-tools" and fixed workflows, demonstrating that hand-engineered agents often outperform dynamically-planned systems in production environments.
AI · Neutral · arXiv – CS AI · Jun 2 · 5/10
🧠A research paper evaluates dynamic coordination strategy selection for enterprise multi-agent systems across 1,440 test cases, finding that the optimal strategy varies by problem class and that no single coordination approach consistently outperforms the others. The study recommends dynamic routing as a calibrated default rather than deterministic winner-selection, challenging the assumption that fixed global coordination policies suit all enterprise tasks.
🏢 OpenAI
AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers propose Post-Deterministic Distributed Systems (PDDS), a new framework for coordinating infrastructure where autonomous agents, stochastic models, and deterministic code coexist—challenging decades-old assumptions in distributed computing that relied on predictable, deterministic participant behavior.
AI · Neutral · arXiv – CS AI · May 29 · 6/10
🧠Researchers define 'Agentic Technical Debt' as governance liabilities arising from rapidly deployed AI agent systems that lack proper validation and standardization. The paper distinguishes this from traditional technical debt and introduces 'Stochastic Tax' as the ongoing operational cost of managing probabilistic agent behavior, proposing lightweight dashboards and controls to address these challenges.
AI · Neutral · arXiv – CS AI · May 1 · 6/10
🧠Research demonstrates that for procedural tasks, simple in-context prompting with complete procedures in the system prompt outperforms complex agent orchestration frameworks like LangGraph and CrewAI. Testing across three domains showed that the simpler approach achieved quality scores of 4.53-5.00 versus 4.17-4.84 for orchestrated systems, with failure rates 50-76% lower, suggesting that advances in frontier LLM capabilities have eliminated the need for external orchestration on such tasks.
🏢 OpenAI
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠ClawVM is a virtual memory management system designed for stateful LLM agents that addresses critical failures in current context window management. The system implements typed pages, multi-resolution representations, and validated writeback protocols to ensure deterministic state residency and durability, adding minimal computational overhead.
AI · Neutral · arXiv – CS AI · Apr 14 · 6/10
🧠A theoretical research paper examines Promise Theory as a framework for understanding cooperation between human and machine agents in autonomous systems. The work revisits established principles of agent cooperation to address how diverse components—humans, hardware, software, and AI—maintain alignment with intended purposes through signaling, trust, and feedback mechanisms.