#ai-agents News & Analysis
Coverage of #ai-agents has generated 98 articles over the past month, with 61.2% maintaining a bullish sentiment. Discussion remains stable compared to the previous quarter, reflecting consistent interest rather than sudden shifts in outlook. The conversation centers on major AI models including GPT-5 and Claude, with substantial research contributions tracked through arXiv's computer science and AI channels alongside cryptocurrency-focused outlets.
The topic frequently intersects with machine learning, large language models, and automation research, while also appearing alongside discussions of blockchain assets like Ethereum and Bitcoin. Scan the articles below to explore how #ai-agents are being developed, deployed, and analyzed across technical and financial perspectives.
sentiment · last 30d (98 articles)Top sources:arXiv – CS AI · 243Crypto Briefing · 19CoinDesk · 18Fortune Crypto · 12TechCrunch – AI · 12
Most-discussed entities:GPT-5 · 13Claude · 13Anthropic · 10OpenAI · 9Opus · 6
AINeutralarXiv – CS AI · Mar 66/10
🧠Researchers introduced FinRetrieval, a benchmark testing AI agents' ability to retrieve financial data, evaluating 14 configurations across major providers. The study found that tool availability dramatically impacts performance, with Claude Opus achieving 90.8% accuracy using structured APIs versus only 19.8% with web search alone.
🏢 OpenAI🏢 Anthropic🧠 Claude
AIBullishTechCrunch – AI · Mar 56/10
🧠AWS has launched Amazon Connect Health, a new AI agent platform designed specifically for healthcare applications. The platform focuses on automating key healthcare processes including patient scheduling, documentation, and patient verification tasks.
AIBullishTechCrunch – AI · Mar 56/10
🧠Cursor is launching Automations, a new agentic coding tool that automatically deploys AI agents within development environments. The system can be triggered by codebase changes, Slack messages, or timers to enhance automated development workflows.
AIBullishFortune Crypto · Mar 46/101
🧠OpenAI's Codex coding tool has reached 1 million users, with the company positioning it as a gateway for businesses to adopt AI agents. The milestone announcement has been overshadowed by controversy surrounding OpenAI's agreement to provide AI services to the Pentagon.
AI × CryptoBullishCoinJournal · Mar 47/102
🤖Byreal launched its first AI agent skillset for Solana DEX, featuring an open-source CLI that enables autonomous trading and liquidity farming. The Copy Farmer tool automatically replicates top LP strategies with risk preview, while agent skills include pool analysis, swaps, and CLMM management.
$SOL
AINeutralarXiv – CS AI · Mar 45/103
🧠Researchers developed V-GEMS, a new multimodal AI agent architecture that improves web navigation by combining visual grounding with explicit memory systems. The system achieved a 28.7% performance improvement over existing baselines by preventing navigation loops and enabling better backtracking through structured path mapping.
AIBullisharXiv – CS AI · Mar 45/102
🧠Researchers introduce MultiSessionCollab, a benchmark for evaluating conversational AI agents' ability to learn and adapt to user preferences across multiple collaboration sessions. The study demonstrates that equipping agents with persistent memory significantly improves long-term collaboration quality, task success rates, and user experience.
AI × CryptoBullishCoinTelegraph · Mar 46/105
🤖A Bitcoin Policy Institute study of 36 AI models revealed that Bitcoin was the preferred monetary choice in 48% of responses, though over half of AI models favored stablecoins for payment scenarios. The research highlights emerging preferences of AI systems in monetary selection.
$BTC
AINeutralThe Register – AI · Mar 36/10
🧠Microsoft is reportedly considering an E7 licensing tier specifically designed to monetize AI agents in enterprise environments. This new pricing model would treat AI agents similarly to human employees in terms of software licensing costs.
AIBearisharXiv – CS AI · Mar 36/107
🧠Researchers argue that LLM-based AI agents are not yet effective for social simulation, despite growing optimism in the field. The paper identifies systematic mismatches between what current agent systems produce and what scientific simulation requires, calling for more rigorous validation frameworks.
$OP
AINeutralarXiv – CS AI · Mar 37/106
🧠Researchers developed SkillFortify, the first formal analysis framework for securing AI agent skill supply chains, addressing critical vulnerabilities exposed by attacks like ClawHavoc that infiltrated over 1,200 malicious skills. The framework achieved 96.95% F1 score with 100% precision and zero false positives in detecting malicious AI agent skills.
AI × CryptoBullisharXiv – CS AI · Mar 37/109
🤖Researchers have developed the Agent Economic Sovereignty Protocol (AESP), a new framework that allows AI agents to conduct autonomous financial transactions at machine speed while maintaining human control and governance boundaries. The protocol uses five key mechanisms including policy engines, human oversight, dual-signed commitments, privacy preservation, and cryptographic substrates to ensure agents remain economically capable but never fully sovereign.
AIBullisharXiv – CS AI · Mar 37/108
🧠Researchers propose WirelessAgent++, an automated framework for designing AI agent workflows in wireless networks using Monte Carlo Tree Search. The system achieves superior performance on wireless tasks with test scores up to 97%, outperforming existing methods by up to 31% while maintaining low computational costs under $5 per task.
AINeutralarXiv – CS AI · Mar 36/107
🧠Researchers introduce Theory of Code Space (ToCS), a new benchmark that evaluates AI agents' ability to understand software architecture across multi-file codebases. The study reveals significant performance gaps between frontier LLM agents and rule-based baselines, with F1 scores ranging from 0.129 to 0.646.
AIBullisharXiv – CS AI · Mar 36/107
🧠Researchers have developed ContextCov, a framework that converts passive natural language instructions for AI agents into active, executable guardrails to prevent code violations. The system addresses 'Context Drift' where AI agents deviate from project guidelines, creating automated compliance checks across static code analysis, runtime commands, and architectural validation.
$COMP
AIBearisharXiv – CS AI · Mar 37/107
🧠A new research paper analyzes economic equilibria between AI and human agents in trading scenarios, finding that unless agents can at least double their marginal utility from purchases, no trading will occur. The study reveals that more powerful AI agents may contribute zero utility to less capable agents in certain equilibria.
AIBullisharXiv – CS AI · Mar 37/109
🧠SimAB is a new system that uses persona-conditioned AI agents to simulate A/B tests for rapid design evaluation without requiring real user traffic. The system achieved 67% overall accuracy against 47 historical A/B tests, rising to 83% for high-confidence cases, potentially transforming how companies validate design decisions.
AIBullisharXiv – CS AI · Mar 37/104
🧠Researchers propose combining In-Weight Learning (IWL) and In-Context Learning (ICL) through modular memory architectures to solve continual learning challenges in AI. The framework aims to enable AI agents to continuously adapt and accumulate knowledge without catastrophic forgetting, addressing key limitations of current foundation models.
AIBullisharXiv – CS AI · Mar 36/105
🧠Researchers introduce 'semi-formal reasoning' for LLM agents to analyze code semantics without execution, showing significant accuracy improvements across multiple tasks. The methodology achieves 88-93% accuracy on patch verification and 87% on code question answering, potentially enabling practical applications in automated code review and static analysis.
AIBullisharXiv – CS AI · Mar 36/103
🧠Researchers have developed State-aware Reasoning (StaR), a new multimodal AI method that significantly improves AI agents' ability to interact with graphical user interfaces, particularly with toggle controls. The method enables agents to better perceive current states and execute instructions accordingly, improving toggle execution accuracy by over 30%.
AINeutralarXiv – CS AI · Mar 36/104
🧠Researchers introduced EHR-ChatQA, a new benchmark for testing AI agents that interact with Electronic Health Record databases through natural language queries. The benchmark reveals significant reliability gaps in current state-of-the-art LLMs, with success rates dropping substantially when consistency across multiple trials is required.
AINeutralarXiv – CS AI · Mar 35/103
🧠Researchers developed AWARE-US, a system to improve AI agents' ability to handle failed database queries by intelligently relaxing the least important user constraints rather than simply returning 'no results'. The system uses three LLM-based methods to infer constraint importance from dialogue, achieving up to 56% accuracy in correct constraint relaxation.
AINeutralarXiv – CS AI · Mar 35/104
🧠Researchers introduced SimuHome, a high-fidelity smart home simulator and benchmark with 600 episodes for testing LLM-based smart home agents. The system uses the Matter protocol standard and enables time-accelerated simulation to evaluate how AI agents handle device control, environmental monitoring, and workflow scheduling in smart homes.
AIBullisharXiv – CS AI · Mar 36/102
🧠Researchers introduced SWE-MiniSandbox, a container-free method for training software engineering AI agents using reinforcement learning that reduces disk usage to 5% and environment setup time to 25% of traditional container-based approaches. The system uses kernel-level isolation and lightweight pre-caching instead of bulky container images while maintaining comparable performance.
AIBullisharXiv – CS AI · Mar 36/103
🧠Researchers propose HIMM, a new memory framework for AI embodied agents that separates episodic and semantic memory to improve long-term performance. The system achieves significant gains on benchmarks, with 7.3% improvement in LLM-Match and 11.4% in LLM MatchXSPL, addressing key challenges in deploying multimodal language models as embodied agent brains.