#system-architecture News & Analysis

15 articles tagged with #system-architecture. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 23 · 7/10

AOHP: An Open-Source OS-Level Agent Harness for Personalized, Efficient and Secure Interaction

Researchers introduce AOHP, an open-source OS-level agent harness built on Android that treats AI agents as first-class operating system actors. The framework addresses architectural gaps in current systems by enabling personalized service composition, efficient agent interfaces, and secure information flow, and the authors report higher task completion rates, lower execution costs, and better security compliance.

AI · Bearish · arXiv – CS AI · Jun 4 · 7/10

What If Prompt Injection Never Left? Exploring Cross-Session Stored Prompt Injection in Agentic Systems

Researchers have identified a critical security vulnerability in agentic AI systems called cross-session stored prompt injection, where malicious instructions can persist within system state and compromise future interactions long after the attacker disconnects. This threat fundamentally differs from traditional prompt injection by leveraging long-lived system artifacts like memories and filesystems, transforming ephemeral model-level attacks into durable system-level vulnerabilities that accumulate over time.

AI · Neutral · arXiv – CS AI · Jun 2 · 7/10

Monitoring Agentic Systems Before They're Reliable

Researchers present a monitoring methodology for agentic AI systems still in early production stages, where structural integration defects rather than task-level errors cause most failures. The approach uses variance-based characterization across three monitoring scopes to identify and triage issues, finding that task-level error detection is often masked by underlying system architecture problems.

AI · Neutral · arXiv – CS AI · May 12 · 7/10

AgentCollabBench: Diagnosing When Good Agents Make Bad Collaborators

Researchers introduced AgentCollabBench, a diagnostic benchmark revealing critical vulnerabilities in multi-agent AI systems where constraints silently fail during peer collaboration. The study demonstrates that communication topology—not model capability alone—determines whether safeguards survive information handoffs between agents, exposing structural weaknesses invisible to standard outcome-based evaluation.

GPT-4 · Gemini · Llama

AI · Bullish · arXiv – CS AI · Apr 10 · 7/10

AI-Driven Research for Databases

Researchers propose AI-Driven Research for Systems (ADRS), a framework using large language models to automate database optimization by generating and evaluating hundreds of candidate solutions. By co-evolving evaluators with solutions, the team demonstrates discovery of novel algorithms achieving up to 6.8x latency improvements over existing baselines in buffer management, query rewriting, and index selection tasks.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

Confidence Laundering in Agent Systems: Why Uncertainty Needs a Latent Carrier

Researchers identify 'confidence laundering' as a critical failure mode in multi-component agent systems where upstream uncertainty gets masked by downstream components, leading to error amplification. They propose 'latent uncertainty' as a solution to preserve decision fragility across component interfaces rather than treating intermediate outputs as procedurally valid artifacts.
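The failure mode can be illustrated with a toy two-stage pipeline. This is only a sketch under stated assumptions: the `Estimate` type, the component names, the confidence values, and the multiply-the-confidences rule are all hypothetical, and an explicit scalar is used here as a stand-in rather than the latent carrier the paper proposes.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """A value paired with the confidence of the component that produced it."""
    value: str
    confidence: float  # in [0, 1]; hypothetical scale for illustration


def retrieve(query: str) -> Estimate:
    # Hypothetical upstream component that is unsure about its result.
    return Estimate(value=f"doc for {query}", confidence=0.4)


def summarize_laundered(doc: Estimate) -> Estimate:
    # Downstream step that reports only its own confidence, so the
    # upstream doubt (0.4) is silently dropped at the interface.
    return Estimate(value=f"summary of {doc.value}", confidence=0.95)


def summarize_carried(doc: Estimate) -> Estimate:
    # Downstream step that carries upstream doubt forward by multiplying
    # confidences (an assumed, deliberately simplistic combination rule).
    return Estimate(
        value=f"summary of {doc.value}",
        confidence=0.95 * doc.confidence,
    )


laundered = summarize_laundered(retrieve("q"))
carried = summarize_carried(retrieve("q"))
print(laundered.confidence)          # 0.95
print(round(carried.confidence, 2))  # 0.38
```

Both variants return the same text. Only the second tells a later component that the answer rests on a shaky retrieval.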

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

Human-Less LLM Serving: Quantifying the Human Tax on Throughput

Researchers quantify a significant efficiency cost in LLM serving systems: meeting latency targets designed for human users, namely time to first token (TTFT) and time per output token (TPOT), reduces throughput by 60-93% for AI workloads that do not need human-perceptible latency. The study demonstrates that one-size-fits-all SLA configurations waste substantial computational resources when applied to programmatic AI-to-AI tasks.
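To put the 60-93% range in perspective, the arithmetic below converts a throughput loss into the multiple that could be recovered by relaxing the latency targets. It assumes the reported percentages are measured relative to unconstrained throughput. The summary does not state that baseline, and the function name is hypothetical.

```python
def speedup_if_relaxed(throughput_loss: float) -> float:
    """Throughput multiple recovered by dropping the latency targets.

    Assumes `throughput_loss` is the fractional loss relative to
    unconstrained throughput: constrained = (1 - loss) * unconstrained.
    """
    if not 0.0 <= throughput_loss < 1.0:
        raise ValueError("throughput_loss must be in [0, 1)")
    return 1.0 / (1.0 - throughput_loss)


for loss in (0.60, 0.93):
    print(f"{loss:.0%} loss -> {speedup_if_relaxed(loss):.1f}x throughput if relaxed")
# 60% loss -> 2.5x throughput if relaxed
# 93% loss -> 14.3x throughput if relaxed
```

Under that assumption, the same hardware would serve roughly 2.5 to 14 times more machine-to-machine traffic once human-oriented targets are dropped.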

AI · Neutral · arXiv – CS AI · Jun 19 · 6/10

Grounded Inference: Principles for Deterministically Encapsulated Generative Models

Researchers propose a foundational framework for safely integrating generative AI models into traditional computational systems through four architectural primitives that enable deterministic encapsulation of probabilistic models. The work addresses critical risks early adopters have faced and identifies two common anti-patterns to help engineers avoid costly mistakes when deploying AI systems.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

Twelve quick tips for designing AI-driven HPC workflows

This technical guide presents twelve practical recommendations for designing AI-driven high-performance computing (HPC) workflows that balance the iterative, probabilistic nature of modern AI with traditional HPC infrastructure. The article addresses critical system-level challenges including containerization, resource management, and I/O optimization, providing researchers with a framework to transition from rigid computational pipelines to adaptive, intelligent environments.

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

Diagnosing Failure Modes of Shared-State Collaboration in Resource-Constrained Visual Agents

Researchers introduce CoSee, an auditing framework for analyzing failure modes in collaborative visual reasoning systems using resource-constrained language models (4B-8B parameters). The study reveals that shared working memory architectures paradoxically amplify hallucinations rather than improve performance, identifying two critical failure modes: noise reinforcement and policy collapse.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

MemFail: Stress-Testing Failure Modes of LLM Memory Systems

Researchers introduce MemFail, a diagnostic benchmark for testing failure modes in LLM memory systems by isolating three core operations: summarization, storage, and retrieval. The benchmark evaluates state-of-the-art memory systems across five adversarially-designed datasets to empirically understand architectural tradeoffs, moving beyond aggregate accuracy metrics.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

VeriTrans: Fine-Tuned LLM-Assisted NL-to-PL Translation via a Deterministic Neuro-Symbolic Pipeline

VeriTrans is a machine learning system that converts natural language requirements into formal logic suitable for automated solvers, using a validator-gated pipeline to ensure reliability. Achieving 94.46% correctness on 2,100 specifications, the system combines fine-tuned language models with round-trip verification and deterministic execution, enabling auditable translation for critical applications.

$PL · $NL · $CNF
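The idea of a validator gate with round-trip verification can be sketched as follows. This is not VeriTrans's pipeline: the function names, the lookup tables standing in for the fine-tuned model, and the string-equality check are all hypothetical placeholders chosen to keep the example self-contained.

```python
from typing import Callable, Optional


def gated_translate(
    requirement: str,
    translate: Callable[[str], str],
    back_translate: Callable[[str], str],
    same_meaning: Callable[[str, str], bool],
) -> Optional[str]:
    """Return the formal translation only if it survives a round trip."""
    formula = translate(requirement)
    recovered = back_translate(formula)
    # The gate: reject rather than pass along an unverified translation.
    return formula if same_meaning(requirement, recovered) else None


# Toy stand-ins (hypothetical): a lookup table plays the role of the model.
TO_LOGIC = {"if a then b": "(!a | b)", "a and b": "(a & b)"}
TO_TEXT = {formula: text for text, formula in TO_LOGIC.items()}


def translate(text: str) -> str:
    return TO_LOGIC.get(text.lower().strip(), "?")


def back_translate(formula: str) -> str:
    return TO_TEXT.get(formula, "")


def same_meaning(original: str, recovered: str) -> bool:
    # Placeholder equivalence check: normalized string equality.
    return original.lower().strip() == recovered


print(gated_translate("If A then B", translate, back_translate, same_meaning))
# (!a | b)
print(gated_translate("A unless B", translate, back_translate, same_meaning))
# None
```

Returning `None` instead of a best guess is the point of the gate: downstream solvers only ever see translations that passed the check.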

AI · Bullish · arXiv – CS AI · Apr 14 · 6/10

AEG: A Baremetal Framework for AI Acceleration via Direct Hardware Access in Heterogeneous Accelerators

Researchers introduce AEG, a bare-metal runtime framework that enables high-performance machine learning inference on heterogeneous AI accelerators without OS overhead. The system achieves 9.2× higher compute efficiency and uses 11× fewer hardware tiles than Linux-based alternatives, demonstrating significant potential for edge AI deployment optimization.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Beyond Local Code Optimization: Multi-Agent Reasoning for Software System Optimization

Researchers introduced a multi-agent AI framework for whole-system software optimization that goes beyond local code improvements to analyze entire microservice architectures. The system uses coordinated agents for summarization, analysis, optimization, and verification, achieving 36.58% throughput improvement and 27.81% response time reduction in proof-of-concept testing.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 7

QuickGrasp: Responsive Video-Language Querying Service via Accelerated Tokenization and Edge-Augmented Inference

Researchers propose QuickGrasp, a video-language querying system that combines local processing with edge computing to achieve both fast response times and high accuracy. The system achieves up to 12.8x reduction in response delay while maintaining the accuracy of large video-language models through accelerated tokenization and adaptive edge augmentation.