y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#agentic-systems News & Analysis

34 articles tagged with #agentic-systems. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

34 articles
AINeutralarXiv – CS AI · May 46/10
🧠

PORTool: Importance-Aware Policy Optimization with Rewarded Tree for Multi-Tool-Integrated Reasoning

PORTool is a new policy-optimization algorithm that improves how AI agents learn to use external tools by solving the credit-assignment problem in multi-step reasoning tasks. The method uses a rewarded tree structure to assign rewards at individual steps rather than only at outcomes, enabling agents to achieve higher accuracy while reducing unnecessary tool calls.

AINeutralarXiv – CS AI · May 16/10
🧠

Rethinking Agentic Reinforcement Learning In Large Language Models

A new research paper examines the shift from traditional reinforcement learning toward agentic AI systems powered by large language models, where AI agents can autonomously set goals, plan long-term strategies, and adapt dynamically in complex environments. This paradigm moves beyond static, episodic training to incorporate cognitive capabilities like meta-reasoning and self-reflection, representing a fundamental evolution in how RL systems are designed and deployed.

AINeutralarXiv – CS AI · Apr 206/10
🧠

The Semi-Executable Stack: Agentic Software Engineering and the Expanding Scope of SE

A research paper proposes that AI-driven software engineering doesn't threaten the field but rather expands its scope to include 'semi-executable' artifacts—combinations of natural language, tools, and workflows requiring human or probabilistic interpretation. The Semi-Executable Stack model provides a diagnostic framework across six layers to understand how software engineering practices evolve as AI agents handle routine tasks.

AINeutralarXiv – CS AI · Apr 146/10
🧠

ATANT v1.1: Positioning Continuity Evaluation Against Memory, Long-Context, and Agentic-Memory Benchmarks

ATANT v1.1 is a companion paper clarifying how existing memory and context evaluation benchmarks (LOCOMO, LongMemEval, BEAM, MemoryBench, and others) fail to measure 'continuity' as defined in the original v1.0 framework. The analysis reveals that existing benchmarks cover a median of only 1 out of 7 required continuity properties, and the authors demonstrate a significant measurement gap through comparative scoring: their system achieves 96% on ATANT but only 8.8% on LOCOMO, proving these benchmarks evaluate different capabilities.

AINeutralarXiv – CS AI · Apr 136/10
🧠

Model Space Reasoning as Search in Feedback Space for Planning Domain Generation

Researchers present a novel approach using agentic language model feedback frameworks to generate planning domains from natural language descriptions augmented with symbolic information. The method employs heuristic search over model space optimized by various feedback mechanisms, including landmarks and plan validator outputs, to improve domain quality for practical deployment.

AINeutralarXiv – CS AI · Apr 136/10
🧠

Litmus (Re)Agent: A Benchmark and Agentic System for Predictive Evaluation of Multilingual Models

Researchers introduce Litmus (Re)Agent, an agentic system that predicts how multilingual AI models will perform on tasks lacking direct benchmark data. Using a controlled benchmark of 1,500 questions across six tasks, the system decomposes queries into hypotheses and synthesizes predictions through structured reasoning, outperforming competing approaches particularly when direct evidence is sparse.

AINeutralarXiv – CS AI · Mar 37/108
🧠

PhotoBench: Beyond Visual Matching Towards Personalized Intent-Driven Photo Retrieval

Researchers introduce PhotoBench, the first benchmark for personalized photo retrieval using authentic personal albums rather than web images. The study reveals critical limitations in current AI systems, including modality gaps in unified embedding models and poor tool orchestration in agentic systems.

AIBullisharXiv – CS AI · Mar 26/1015
🧠

Robust and Efficient Tool Orchestration via Layered Execution Structures with Reflective Correction

Researchers propose a new approach to tool orchestration in AI agent systems using layered execution structures with reflective error correction. The method reduces execution complexity by using coarse-grained layer structures for global guidance while handling failures locally, eliminating the need for precise dependency graphs or fine-grained planning.

← PrevPage 2 of 2