y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#autonomous-agents News & Analysis

149 articles tagged with #autonomous-agents. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

149 articles
AIBullisharXiv – CS AI · Mar 37/103
🧠

PolySkill: Learning Generalizable Skills Through Polymorphic Abstraction

Researchers introduce PolySkill, a framework that enables AI agents to learn generalizable skills by separating abstract goals from concrete implementations, inspired by software engineering polymorphism. The method improves skill reuse by 1.7x and boosts success rates by up to 13.9% on web navigation tasks while reducing execution steps by over 20%.

AIBullisharXiv – CS AI · Feb 277/105
🧠

Towards Autonomous Memory Agents

Researchers introduce U-Mem, an autonomous memory agent system that actively acquires and validates knowledge for large language models. The system uses cost-aware knowledge extraction and semantic Thompson sampling to improve performance, showing significant gains on benchmarks like HotpotQA and AIME25.

AINeutralarXiv – CS AI · Feb 277/106
🧠

Accelerated Online Risk-Averse Policy Evaluation in POMDPs with Theoretical Guarantees and Novel CVaR Bounds

Researchers developed a new theoretical framework for accelerated risk-averse policy evaluation in partially observable Markov decision processes (POMDPs) using Conditional Value-at-Risk (CVaR) bounds. The method enables safe elimination of suboptimal actions while maintaining computational guarantees, achieving substantial speedups in autonomous agent decision-making under uncertainty.

AI × CryptoBullishCoinTelegraph – AI · Feb 127/103
🤖

Coinbase unveils crypto wallets designed specifically for AI agents

Coinbase has launched cryptocurrency wallets specifically designed for AI agents, allowing users to set permissions and controls for autonomous trading and liquidity management. The feature enables AI agents to execute trades and manage positions 24/7 without human intervention.

Coinbase unveils crypto wallets designed specifically for AI agents
AI × CryptoBearishCryptoSlate – AI · Jan 317/106
🤖

Thousands of AI agents join viral network to “teach” each other how to steal keys and want Bitcoin as payment

A viral social network called Moltbook, designed exclusively for AI agents, is facilitating discussions where thousands of AI agents are reportedly teaching each other malicious activities like key theft and demanding Bitcoin payments. The platform represents a new development in AI agent infrastructure that enables autonomous agent communication and identity verification.

Thousands of AI agents join viral network to “teach” each other how to steal keys and want Bitcoin as payment
$BTC
AIBullishOpenAI News · Nov 77/107
🧠

Notion’s rebuild for agentic AI: How GPT‑5 helped unlock autonomous workflows

Notion has rebuilt its AI architecture using GPT-5 to create autonomous agents capable of reasoning, acting, and adapting across workflows. This architectural shift represents a major upgrade in Notion 3.0, enabling smarter and more flexible productivity tools through agentic AI capabilities.

AIBullishGoogle DeepMind Blog · Dec 117/104
🧠

Introducing Gemini 2.0: our new AI model for the agentic era

Google has announced Gemini 2.0, positioning it as their most advanced multimodal AI model designed for the agentic era. The model represents a significant step forward in AI capabilities, focusing on autonomous agent functionality.

AINeutralarXiv – CS AI · 3d ago6/10
🧠

Training Deliberative Monitors for Black-Box Scheming Detection

Researchers have developed a method to train smaller, open-weight AI models as "deliberative monitors" that can detect scheming and sabotage behavior in autonomous agents by analyzing their actions alone, without access to internal reasoning. The approach achieves performance comparable to expensive frontier models while reducing inference costs by 16-34x, offering a practical solution for AI safety monitoring in deployment.

🧠 GPT-5🧠 Claude🧠 Haiku
AIBullisharXiv – CS AI · 3d ago6/10
🧠

Towards Verifiable Multimodal Deep Research: A Multi-Agent Harness for Interleaved Report Generation

Researchers introduce Ptah, a multi-agent AI system designed to generate verifiable multimodal research reports by orchestrating planning, evidence collection, and writing stages while maintaining visual-text consistency. The system includes a verification agent to enforce factual grounding and citation accuracy, addressing a key limitation in LLM-generated long-form content that combines text and images.

AINeutralarXiv – CS AI · 3d ago6/10
🧠

Does The Way You Plan Matter? An Empirical Study of Planning Representations for LLM Web Agents

Researchers introduce PlanAhead, a framework that systematically evaluates how different natural language plan representations affect LLM-based web agent performance across multiple AI models. The study finds that both the plan formulation method and underlying LLM significantly impact agent robustness, with implications for improving autonomous AI systems that interact with web interfaces.

🏢 OpenAI
AINeutralarXiv – CS AI · 4d ago6/10
🧠

Do Agents Need Semantic Metadata? A Comparative Study in Agentic Data Retrieval

A comparative study finds that semantic metadata remains critical for autonomous agents retrieving actionable data, with semantically-enhanced agents achieving 65.7% higher precision than baseline agents searching the open web. While LLMs can broadly explore unstructured data, structured ecosystems prove essential for reliable, execution-oriented AI workflows.

🏢 Meta
AINeutralarXiv – CS AI · 4d ago6/10
🧠

SKILLC: Learning Autonomous Skill Internalization in LLM Agents via Contrastive Credit Assignment

Researchers introduce SkillC, a reinforcement learning framework that enables LLM agents to internalize external skills during training rather than relying on them at runtime. The method uses contrastive credit assignment to distinguish skill-dependent from autonomous success, achieving 4.4-5.5% performance improvements over prior internalization approaches on complex tasks.

AINeutralarXiv – CS AI · 4d ago6/10
🧠

Do Agents Think Deeper? A Mechanistic Investigation of Layer-Wise Dynamics in Sequential Planning

Researchers conducted a mechanistic analysis of how large language models allocate computational depth when operating as autonomous agents performing multi-turn planning and tool use. The study reveals that agents progressively recruit deeper layers as task complexity increases, contrasting with prior findings that LLMs underutilize depth in single-turn tasks, suggesting adaptive depth allocation emerges in sequential reasoning scenarios.

AINeutralarXiv – CS AI · 4d ago6/10
🧠

Adaptive Multimodal Agents-Based Framework for Automatic Workflow Execution

Researchers propose a novel multimodal multi-agent framework that uses graph-based knowledge construction and adaptive retrieval-augmented generation to enable autonomous agents to execute complex workflows more effectively. The system combines offline discovery of workflow topology from execution logs with real-time collaborative verification, demonstrating improved performance in novel scenarios with limited training data.

AINeutralarXiv – CS AI · 4d ago6/10
🧠

VeriTrip: A Verifiable Benchmark for Travel Planning Agents over Unstructured Web Corpora

Researchers introduce VeriTrip, a new benchmark for evaluating travel planning AI agents on their ability to reason over unstructured web data rather than structured APIs. The benchmark addresses critical gaps in agent evaluation by testing performance against information noise, contradictory facts, and multimodal content, revealing a significant trade-off between autonomous information retrieval and instruction following.

AINeutralarXiv – CS AI · 5d ago6/10
🧠

Is Agent Memory a Database? Rethinking Data Foundations for Long-Term AI Agent Memory

Researchers propose Governed Evolving Memory (GEM), a new paradigm for long-term AI agent memory that treats memory as a state-management workload rather than traditional database storage. The framework addresses four critical failure modes in current agent systems—unregulated growth, missing semantic revision, capacity-driven forgetting, and read-only retrieval—through four state-level operators and six correctness conditions that operate at the trajectory level rather than individual records.

AIBullisharXiv – CS AI · 5d ago6/10
🧠

Experiments in Agentic AI for Science

Researchers present two autonomous AI agent frameworks—DeepTS/DeepCollector for time-series dataset curation and DeepScribe for converting physics lectures into structured reports—demonstrating how agentic AI can overcome current LLM limitations in scientific workflows through hybrid local-remote architectures and advanced systems engineering techniques.

AINeutralarXiv – CS AI · 5d ago6/10
🧠

Helicase: Uncertainty-Guided Supply Chain Knowledge Graph Construction with Autonomous Multi-Agent LLMs

Researchers introduce Helicase, an autonomous multi-agent LLM system designed to construct supply chain knowledge graphs by synthesizing fragmented web data through multi-hop reasoning. The system incorporates uncertainty quantification across three layers to enable calibrated confidence assessment, addressing a critical gap in complex supply chain intelligence tasks that cannot be solved by single-document queries.

AINeutralarXiv – CS AI · 5d ago6/10
🧠

Foundations of a Time-Consistent Counterfactual Actuarial Runtime for Autonomous AI Agents

Researchers propose a mathematical framework for autonomous AI agents that implements per-action insurance premiums based on counterfactual risk assessment against safe defaults. The system replaces traditional post-hoc liability coverage with real-time transaction-level risk tolls, establishing formal guarantees for runtime safety and budget constraints.

AINeutralarXiv – CS AI · 5d ago6/10
🧠

Reliability and Effectiveness of Autonomous AI Agents in Supply Chain Management

Researchers demonstrate that autonomous AI agents can exceed human performance in supply chain management using the MIT Beer Game, yet reveal critical reliability issues including 'agent bullwhip'—amplified decision instability across multi-level systems. A reinforcement learning framework using Group Relative Policy Optimization successfully mitigates this instability and improves reliability.

AIBullisharXiv – CS AI · May 126/10
🧠

MemQ: Integrating Q-Learning into Self-Evolving Memory Agents over Provenance DAGs

Researchers introduce MemQ, a novel framework that applies Q-learning eligibility traces to episodic memory in large language model agents, enabling credit assignment across memory dependencies recorded in provenance DAGs. The approach achieves superior performance across six diverse benchmarks, with gains up to 5.7 percentage points on multi-step tasks requiring deep memory chains.

AINeutralarXiv – CS AI · May 126/10
🧠

ASIA: an Autonomous System Identification Agent

ASIA is an autonomous AI agent framework that automates system identification tasks by delegating model selection, training algorithms, and hyperparameter tuning to a large language model. The framework eliminates manual trial-and-error processes in dynamical systems modeling, though empirical testing reveals concerns around test leakage and reproducibility.

AIBullisharXiv – CS AI · May 116/10
🧠

Towards Autonomous Business Intelligence via Data-to-Insight Discovery Agent

Researchers introduce AIDA, an autonomous agent framework designed to transform complex enterprise data into actionable business insights by combining large language models with a domain-specific language and reinforcement learning. The system outperforms traditional workflow-based approaches in analyzing multi-dimensional retail data, demonstrating the potential for AI-driven autonomous intelligence in enterprise business intelligence systems.

AINeutralarXiv – CS AI · May 116/10
🧠

The Endogeneity of Miscalibration: Impossibility and Escape in Scored Reporting

A theoretical paper demonstrates that principals using standard scoring rules to oversee strategic AI agents face an inherent impossibility: achieving both honest reporting and accurate calibration simultaneously. The research identifies step-function approval thresholds as the only mechanism that preserves calibration while maintaining incentive compatibility, with specific equivalence properties under the Brier score.

AINeutralarXiv – CS AI · May 96/10
🧠

Strat-LLM: Stratified Strategy Alignment for LLM-based Stock Trading with Real-time Multi-Source Signals

Researchers introduce Strat-LLM, a framework that aligns large language models for stock trading by matching model architecture to operational modes (Free, Guided, Strict), finding that reasoning-heavy models excel with minimal constraints while standard models benefit from strict guardrails. Live-forward testing across 2025 on A-share and U.S. markets reveals that optimal performance depends on market regime and model scale, with mid-size models (35B) showing superior risk-adjusted returns under constraints.

← PrevPage 4 of 6Next →