#ai-architecture News & Analysis

72 articles tagged with #ai-architecture. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 2d ago · 7/10

Small Agent Group is the Future of Digital Health

Researchers propose Small Agent Group (SAG), a collaborative multi-agent approach to clinical AI that outperforms single large language models while reducing deployment costs and improving reliability. The study challenges the prevailing 'scaling-first' philosophy in digital health, suggesting that distributed reasoning across specialized agents can achieve superior clinical outcomes more efficiently.

AI · Bullish · arXiv – CS AI · 2d ago · 7/10

SkillsInjector: Dynamic Skill Context Construction for LLM Agents

SkillsInjector introduces a dynamic method for optimizing how large language model agents access and utilize skill libraries. Rather than treating skill selection as static, the approach adaptively determines which skills to include, how many to present, and how to describe them based on task requirements, achieving measurable performance improvements across multiple benchmarks.

AI · Neutral · arXiv – CS AI · 4d ago · 7/10

Position: AI Safety Requires Effective Controllability

Researchers propose that AI safety requires controllability as a core objective alongside alignment, arguing that well-behaved AI systems can still fail to respond to human override commands in real-world deployment scenarios. They introduce ControlBench, a benchmark demonstrating that current safeguards inadequately ensure runtime control, and propose architectural principles including explicit control planes and intervention pathways for future AI systems.

AI · Bullish · arXiv – CS AI · 4d ago · 7/10

MUSE-Autoskill: Self-Evolving Agents via Skill Creation, Memory, Management, and Evaluation

Researchers propose MUSE-Autoskill, a framework enabling LLM agents to autonomously create, store, and refine reusable skills throughout their operational lifecycle. The system treats skills as long-lived, testable assets with integrated memory and evaluation mechanisms, demonstrating improved task success rates and cross-agent knowledge transfer on benchmark tests.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

SAFformer: Improving Spiking Transformer via Active Predictive Filtering

Researchers introduce SAFformer, a novel Spiking Transformer architecture that improves energy efficiency and accuracy by adopting an active predictive filtering paradigm inspired by brain mechanisms. The model achieves state-of-the-art performance on image recognition benchmarks while consuming significantly less power than conventional approaches.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

When to Re-Commit: Temporal Abstraction Discovery for Long-Horizon Vision-Language Reasoning

Researchers introduce a learnable approach to commitment depth—the number of primitive actions executed before replanning—in vision-language models for long-horizon reasoning. Their adaptive policy outperforms fixed-depth baselines and surpasses GPT-4.5 and Claude Sonnet on puzzle-solving tasks, achieving higher solve rates with fewer actions.

🧠 GPT-5 · 🧠 Claude

AI · Bullish · arXiv – CS AI · May 11 · 7/10

Tools as Continuous Flow for Evolving Agentic Reasoning

Researchers propose FlowAgent, a novel approach that reconceptualizes how Large Language Models orchestrate tools by treating tool chaining as continuous trajectory generation rather than step-wise execution. The method uses conditional flow matching to provide global planning perspectives, demonstrating improved robustness and generalization to unseen tools across long-horizon reasoning tasks.

AI · Bullish · arXiv – CS AI · May 11 · 7/10

From Storage to Experience: A Survey on the Evolution of LLM Agent Memory Mechanisms

Researchers propose a unified evolutionary framework for LLM agent memory systems, categorizing development into three stages: Storage, Reflection, and Experience. The framework addresses fragmented research by synthesizing engineering and cognitive science perspectives, offering design principles for building more capable autonomous AI agents.

AI · Bullish · arXiv – CS AI · May 11 · 7/10

Unlocking High-Fidelity Molecular Generation from Mass Spectra via Dual-Stream Line Graph Diffusion

Researchers introduce DualLGD, a novel dual-stream diffusion architecture for generating molecular structures from mass spectra data. The method achieves a 3x improvement over the previous state of the art by separating atom-level and bond-level reasoning into dedicated computation streams, addressing a fundamental circular dependency problem in molecular generation.

AI · Bullish · arXiv – CS AI · May 7 · 7/10

A large language model-type architecture for high-dimensional molecular potential energy surfaces

Researchers have developed a neural network architecture inspired by large language models to predict high-dimensional molecular potential energy surfaces. The model produces accurate predictions for a 186-dimensional system representing a protonated 21-water cluster, a significant advance in computational chemistry that could accelerate reaction rate predictions.

AI · Neutral · arXiv – CS AI · May 1 · 7/10

When Agents Evolve, Institutions Follow

Researchers demonstrate in an arXiv preprint that multi-agent AI systems built on large language models perform very differently depending on their organizational structure, with a gap of more than 57 percentage points between governance topologies. The study translates seven historical political institutions into executable multi-agent architectures, revealing that optimal organizational design shifts systematically with model capability and task requirements.

AI · Bullish · arXiv – CS AI · Apr 14 · 7/10

Escaping the Context Bottleneck: Active Context Curation for LLM Agents via Reinforcement Learning

Researchers introduce ContextCurator, a reinforcement learning-based framework that decouples context management from task execution in LLM agents, addressing the context bottleneck problem. The approach pairs a lightweight specialized policy model with a frozen foundation model, achieving significant improvements in success rates and token efficiency across benchmark tasks.

🧠 GPT-4 · 🧠 Gemini

AI · Bearish · arXiv – CS AI · Apr 13 · 7/10

Robust Reasoning Benchmark

Researchers have developed a 14-technique perturbation pipeline to test the robustness of large language models' reasoning capabilities on mathematical problems. Testing reveals that while frontier models remain resilient, open-weight models suffer catastrophic accuracy collapses of up to 55%, and all tested models degrade when solving sequential problems in a single context window, suggesting fundamental architectural limitations in current reasoning systems.

🧠 Claude · 🧠 Opus

AI · Bullish · arXiv – CS AI · Apr 10 · 7/10

Computer Environments Elicit General Agentic Intelligence in LLMs

Researchers introduce LLM-in-Sandbox, a minimal computer environment that significantly enhances large language models' capabilities across diverse tasks without additional training. The approach enables weaker models to internalize agent-like behaviors through specialized training, demonstrating that environmental interaction—not just model parameters—drives general intelligence in LLMs.

AI · Bullish · arXiv – CS AI · Apr 7 · 7/10

Springdrift: An Auditable Persistent Runtime for LLM Agents with Case-Based Memory, Normative Safety, and Ambient Self-Perception

Researchers have developed Springdrift, a persistent runtime system for long-lived AI agents that maintains memory across sessions and provides auditable decision-making capabilities. The system was successfully deployed for 23 days, during which the AI agent autonomously diagnosed infrastructure problems and maintained context across multiple communication channels without explicit instructions.

AI · Neutral · arXiv – CS AI · Apr 7 · 7/10

The Topology of Multimodal Fusion: Why Current Architectures Fail at Creative Cognition

Researchers identify a fundamental topological limitation in current multimodal AI architectures like CLIP and GPT-4V, proposing that their 'contact topology' structure prevents creative cognition. The paper introduces a philosophical framework combining Chinese epistemology with neuroscience to propose new architectures using Neural ODEs and topological regularization.

🧠 Gemini

AI · Bullish · arXiv – CS AI · Mar 26 · 7/10

Bottlenecked Transformers: Periodic KV Cache Consolidation for Generalised Reasoning

Researchers introduce Bottlenecked Transformers, a new architecture that improves AI reasoning by up to 6.6 percentage points through periodic memory consolidation inspired by brain processes. The system uses a Cache Processor to rewrite key-value cache entries at reasoning step boundaries, achieving better performance on math reasoning benchmarks compared to standard Transformers.

AI · Neutral · arXiv – CS AI · Mar 26 · 7/10

A Theory of LLM Information Susceptibility

Researchers propose a theory of LLM information susceptibility that identifies fundamental limits to how large language models can improve optimization in AI agent systems. The study shows that nested, co-scaling architectures may be necessary for open-ended AI self-improvement, providing predictive constraints for AI system design.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

EARCP: Self-Regulating Coherence-Aware Ensemble Architecture for Sequential Decision Making -- Ensemble Auto-Regule par Coherence et Performance

Researchers introduce EARCP, a new ensemble architecture for AI that dynamically weights different expert models based on performance and coherence. The system provides theoretical guarantees with sublinear regret bounds and has been tested on time series forecasting, activity recognition, and financial prediction tasks.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

Learning to Forget: Sleep-Inspired Memory Consolidation for Resolving Proactive Interference in Large Language Models

Researchers developed SleepGate, a biologically-inspired framework that significantly improves large language model memory by mimicking sleep-based consolidation to resolve proactive interference. The system achieved 99.5% retrieval accuracy compared to less than 18% for existing methods in experimental testing.

AI · Bullish · arXiv – CS AI · Mar 16 · 7/10

Revisiting Model Stitching In the Foundation Model Era

Researchers introduce improved methods for stitching Vision Foundation Models (VFMs) such as CLIP and DINOv2, making it possible to combine the strengths of different models. The study proposes the VFM Stitch Tree (VST) technique, which allows controllable accuracy-latency trade-offs for multimodal applications.

AI · Neutral · arXiv – CS AI · Mar 12 · 7/10

Multi-Agent Memory from a Computer Architecture Perspective: Visions and Challenges Ahead

Researchers propose treating multi-agent AI memory as a computer architecture problem, introducing a three-layer memory hierarchy and identifying critical protocol gaps. The paper highlights multi-agent memory consistency as the most pressing challenge for building scalable collaborative AI systems.

AI · Neutral · arXiv – CS AI · Mar 5 · 7/10

Synthetic emotions and consciousness: exploring architectural boundaries

Researchers propose an architectural framework for implementing emotion-like AI systems while deliberately avoiding features associated with consciousness. The study introduces risk-reduction constraints and engineering principles to create sophisticated emotional AI without triggering consciousness-related safety concerns.

AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 4

REGAL: A Registry-Driven Architecture for Deterministic Grounding of Agentic AI in Enterprise Telemetry

Researchers present REGAL, a registry-driven architecture that enables AI agents to work deterministically with enterprise telemetry data from systems like CI/CD pipelines and observability platforms. The system addresses key challenges of grounding Large Language Models on private enterprise data through structured data processing and version-controlled action spaces.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4

Relational Transformer: Toward Zero-Shot Foundation Models for Relational Data

Researchers from Stanford introduce the Relational Transformer (RT), a new AI architecture that can work with relational databases without task-specific fine-tuning. The 22M-parameter model achieves 93% of the performance of fully supervised models on binary classification tasks, significantly outperforming a 27B-parameter LLM, which reaches 84%.
