#agentic-systems News & Analysis

64 articles tagged with #agentic-systems. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · May 9 · 7/10

Securing the Agent: Vendor-Neutral, Multitenant Enterprise Retrieval and Tool Use

Researchers present a layered security architecture for multitenant enterprise AI systems that isolates data and controls access in retrieval-augmented generation (RAG) and agentic AI deployments. The approach confines security-critical operations to the server while preventing cross-tenant data leakage, and is validated through an open-source OGX framework with negligible performance overhead.

🏢 OpenAI

AI · Bullish · arXiv – CS AI · May 1 · 7/10

ObjectGraph: From Document Injection to Knowledge Traversal -- A Native File Format for the Agentic Era

Researchers introduce ObjectGraph (.og), a new file format designed specifically for how AI agents consume documents through retrieval rather than linear reading. The format reduces token consumption by up to 95.3% while maintaining task accuracy, addressing a fundamental architectural mismatch between traditional documents and LLM agent workflows.

AI · Bullish · arXiv – CS AI · Apr 20 · 7/10

DeepER-Med: Advancing Deep Evidence-Based Research in Medicine Through Agentic AI

Researchers introduce DeepER-Med, an agentic AI framework designed to advance evidence-based medical research with explicit transparency and trustworthiness mechanisms. The system outperforms existing production-grade platforms on complex medical questions and demonstrates clinical alignment in real-world case evaluations, addressing critical gaps in AI reliability for healthcare adoption.

AI · Bullish · arXiv – CS AI · Apr 20 · 7/10

EvoTest: Evolutionary Test-Time Learning for Self-Improving Agentic Systems

Researchers introduce EvoTest, an evolutionary framework enabling AI agents to improve performance across consecutive test episodes without fine-tuning or gradients. The method outperforms existing adaptation techniques on a new Jericho Test-Time Learning benchmark, successfully winning games that all baseline methods failed to complete.

AI · Bullish · arXiv – CS AI · Apr 14 · 7/10

ExecTune: Effective Steering of Black-Box LLMs with Guide Models

Researchers introduce ExecTune, a training methodology for optimizing black-box LLM systems where a guide model generates strategies executed by a core model. The approach improves accuracy by up to 9.2% while reducing inference costs by 22.4%, enabling smaller models like Claude Haiku to match larger competitors at significantly lower computational expense.

🧠 Claude 🧠 Haiku 🧠 Sonnet

AI · Bullish · arXiv – CS AI · Mar 11 · 7/10

AlphaApollo: A System for Deep Agentic Reasoning

AlphaApollo is a new AI reasoning system that addresses limitations in foundation models through multi-turn agentic reasoning, learning, and evolution components. The system demonstrates significant performance improvements across math reasoning benchmarks, with success rates exceeding 85% for tool calls and substantial gains from reinforcement learning across different model scales.

AI · Neutral · arXiv – CS AI · Mar 9 · 7/10

Agentic retrieval-augmented reasoning reshapes collective reliability under model variability in radiology question answering

Researchers evaluated 34 large language models on radiology questions, finding that agentic retrieval-augmented reasoning systems improve consensus and reliability across different AI models. The study shows these systems reduce decision variability between models and increase robust correctness, though 72% of incorrect outputs still carried moderate to high clinical severity.

AI × Crypto · Neutral · Bankless · Mar 6 · 7/10

3 Takeaways from a Big Week in Crypto x AI

The article discusses three key developments in the intersection of AI and cryptocurrency, highlighting both problematic applications like criminal use cases and positive developments such as AI-powered smart contract auditing. These developments signal the emergence of an 'agentic frontier' where AI agents operate autonomously within crypto ecosystems.

AI · Bearish · arXiv – CS AI · Mar 6 · 7/10

Self-Attribution Bias: When AI Monitors Go Easy on Themselves

Research reveals that AI language models exhibit self-attribution bias when monitoring their own behavior, evaluating their own actions as more correct and less risky than identical actions presented by others. This bias causes AI monitors to fail at detecting high-risk or incorrect actions more frequently when evaluating their own outputs, potentially leading to inadequate monitoring systems in deployed AI agents.

AI · Neutral · arXiv – CS AI · Jun 25 · 6/10

Agentic System as Compressor: Quantifying System Intelligence in Bits

Researchers propose measuring agentic AI system intelligence through information compression, demonstrating that components like tools, retrieval, and verification reduce the bits needed to reconstruct outputs across five task domains. This analytical framework provides a quantitative method for evaluating multi-turn AI agents beyond traditional performance metrics.

AI · Neutral · AI News · Jun 11 · 6/10

Xebia: On building the data foundation for AI agents – and then accelerating

Xebia's global CTO Niels Zeilemaker emphasizes that organizations implementing AI agents must prioritize building a strong data foundation first, as agentic AI performance scales directly with data quality and availability. The article argues that without proper data infrastructure and accessibility for AI consumption, organizations cannot effectively accelerate their processes using AI agents.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

What makes a harness a harness: necessary and sufficient conditions for an agent harness

Researchers provide a formal operational definition of 'agent harness' in AI software engineering, establishing necessary and sufficient conditions to distinguish harnesses from related tools like frameworks and SDKs. The work analyzes six real-world implementations and proposes a shared vocabulary to standardize how the industry discusses and compares agentic systems built on language models.

🧠 Claude

AI · Bullish · arXiv – CS AI · Jun 8 · 6/10

Agentic Large Language Models for Automated Structural Analysis of 3D Frame Systems

Researchers developed an agentic LLM framework that automates structural analysis of complex 3D frame systems by decomposing tasks across specialized AI agents. The system converts natural language descriptions into executable engineering simulations with 90% accuracy, advancing AI applications in domain-specific professional workflows.

AI · Bullish · Google Research Blog · Jun 5 · 6/10

Unlocking dependable responses with Gemini Enterprise Agent Platform’s Agentic RAG

Google has introduced Agentic RAG capabilities within its Gemini Enterprise Agent Platform, designed to improve the reliability of AI-generated responses through retrieval-augmented generation techniques. This advancement addresses a critical challenge in enterprise AI deployment: reducing hallucinations and ensuring responses are grounded in accurate, up-to-date data sources.

🧠 Gemini

AI · Bullish · AI News · Jun 4 · 6/10

Amazon brings AI shopping assistant to retailers with Kate Spade

Amazon is launching an Agentic Shopping Assistant built on AWS, enabling third-party retailers like Kate Spade to deploy customized AI shopping assistants on their websites and apps. This move signals Amazon's strategy to monetize its AI infrastructure by licensing it to competitors, expanding the commercial applications of enterprise AI beyond its own retail operations.

AI · Neutral · arXiv – CS AI · Jun 4 · 6/10

DAR: Deontic Reasoning with Agentic Harnesses

Researchers introduce Deontic Agentic Reasoning (DAR), a new framework that enables large language models to better tackle complex rule-based reasoning tasks by dynamically querying statutes and policies. Testing on DeonticBench shows agentic approaches improve performance on hard cases, though weaker models struggle with numerical reasoning and consume significantly more tokens.