#ai-agents News & Analysis

Coverage of #ai-agents produced 98 articles over the past month, 61.2% of them bullish in sentiment. Discussion volume is stable compared with the previous quarter, reflecting consistent interest rather than sudden shifts in outlook. The conversation centers on major AI models, including GPT-5 and Claude, with substantial research contributions tracked through arXiv's computer science and AI channels alongside cryptocurrency-focused outlets. The topic frequently intersects with machine learning, large language models, and automation research, and also appears alongside discussions of blockchain assets such as Ethereum and Bitcoin. Scan the articles below to explore how AI agents are being developed, deployed, and analyzed from technical and financial perspectives.

sentiment · last 30d (98 articles)

Top sources: arXiv – CS AI (243) · Crypto Briefing (19) · CoinDesk (18) · Fortune Crypto (12) · TechCrunch – AI (12)

Often co-tagged with: #machine-learning #llm #research #automation #enterprise-ai #open-source

Most-discussed entities: GPT-5 (13) · Claude (13) · Anthropic (10) · OpenAI (9) · Opus (6)

902 articles

AI × Crypto · Bullish · Bitcoinist · Feb 27 · 7/10 · 4

Ethereum Network Takes The Crown As The Home Of On-Chain AI Agents

Ethereum is emerging as the dominant blockchain for AI agent development, expanding beyond its traditional DeFi leadership role. The network is now positioning itself as the primary platform for on-chain AI innovation, demonstrating constructive rather than speculative growth.

$ETH

AI × Crypto · Bullish · The Block · Feb 27 · 7/10 · 7

The surge of RWAs, AI and tokenized equities, with Galaxy and Ondo

DeFi leaders from Ondo and Galaxy Digital discuss the emerging trends of Real World Assets (RWAs), AI integration, and tokenized equities in decentralized finance. The conversation explores how AI agents are expected to transform DeFi trading practices and provides a bullish outlook despite current market conditions.

AI · Bullish · OpenAI News · Feb 27 · 7/10 · 6

OpenAI and Amazon announce strategic partnership

OpenAI and Amazon have announced a strategic partnership that will integrate OpenAI's Frontier platform with AWS infrastructure. The collaboration aims to expand AI capabilities through enhanced infrastructure, custom model development, and enterprise AI agent solutions.

AI · Bullish · OpenAI News · Feb 27 · 7/10 · 5

Introducing the Stateful Runtime Environment for Agents in Amazon Bedrock

Amazon Bedrock introduces a new Stateful Runtime Environment for AI agents that provides persistent orchestration, memory capabilities, and secure execution for complex multi-step AI workflows. The service leverages OpenAI technology to enable more sophisticated AI agent operations with maintained state across interactions.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 7

SUPERGLASSES: Benchmarking Vision Language Models as Intelligent Agents for AI Smart Glasses

Researchers introduce SUPERGLASSES, the first comprehensive benchmark for evaluating Vision Language Models in AI smart glasses applications, comprising 2,422 real-world egocentric image-question pairs. They also propose SUPERLENS, a multimodal agent that outperforms GPT-4o by 2.19% through retrieval-augmented answer generation with automatic object detection and web search capabilities.

AI · Neutral · arXiv – CS AI · Feb 27 · 7/10 · 6

VeRO: An Evaluation Harness for Agents to Optimize Agents

Researchers introduced VeRO (Versioning, Rewards, and Observations), a new evaluation framework for testing AI coding agents that can optimize other AI agents through iterative improvement cycles. The system provides reproducible benchmarks and structured execution traces to systematically measure how well coding agents can improve target agents' performance.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 6

Toward Personalized LLM-Powered Agents: Foundations, Evaluation, and Future Directions

Researchers published a comprehensive survey on personalized LLM-powered agents that can adapt to individual users over extended interactions. The study organizes these agents into four key components: profile modeling, memory, planning, and action execution, providing a framework for developing more user-aligned AI assistants.

AI · Neutral · arXiv – CS AI · Feb 27 · 7/10 · 7

LiveMCPBench: Can Agents Navigate an Ocean of MCP Tools?

LiveMCPBench introduces the first large-scale benchmark evaluating AI agents' ability to navigate real-world tasks using Model Context Protocol (MCP) tools across multiple servers. The benchmark reveals significant performance gaps, with top model Claude-Sonnet-4 achieving 78.95% success while most models only reach 30-50%, identifying tool retrieval as the primary bottleneck.

$OCEAN

AI · Neutral · arXiv – CS AI · Feb 27 · 7/10 · 7

Vibe Researching as Wolf Coming: Can AI Agents with Skills Replace or Augment Social Scientists?

A research paper introduces the concept of 'vibe researching', in which AI agents autonomously execute entire research pipelines from idea to submission using specialized skills. The study finds that AI agents excel at speed and methodological tasks but struggle with theoretical originality and tacit knowledge, so the boundary for delegating research work is cognitive rather than sequential.

AI · Neutral · arXiv – CS AI · Feb 27 · 7/10 · 6

ProactiveMobile: A Comprehensive Benchmark for Boosting Proactive Intelligence on Mobile Devices

Researchers introduce ProactiveMobile, a new benchmark for developing AI agents that can proactively anticipate user needs on mobile devices rather than just responding to commands. The benchmark includes over 3,600 test instances across 14 scenarios, with current models achieving low success rates, indicating significant room for improvement in proactive AI capabilities.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 7

OmniGAIA: Towards Native Omni-Modal AI Agents

Researchers introduce OmniGAIA, a comprehensive benchmark for evaluating omni-modal AI agents that can process video, audio, and image data simultaneously with complex reasoning capabilities. They also propose OmniAtlas, a foundation agent that enhances existing open-source models' ability to use tools across multiple modalities, marking progress toward more capable AI assistants.

AI · Neutral · arXiv – CS AI · Feb 27 · 7/10 · 3

The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution

Researchers introduce Tool Decathlon (Toolathlon), a comprehensive benchmark for evaluating AI language agents across 32 software applications and 604 tools in realistic, multi-step scenarios. The benchmark reveals significant limitations in current AI models, with the best performer (Claude-4.5-Sonnet) achieving only 38.6% success rate on complex, real-world tasks.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 5

Agent Behavioral Contracts: Formal Specification and Runtime Enforcement for Reliable Autonomous AI Agents

Researchers introduce Agent Behavioral Contracts (ABC), a formal framework for specifying and enforcing reliable behavior in autonomous AI agents. The system addresses critical issues of drift and governance failures in AI deployments by implementing runtime-enforceable contracts that achieve 88-100% compliance rates and significantly improve violation detection.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 4

MiroFlow: Towards High-Performance and Robust Open-Source Agent Framework for General Deep Research Tasks

Researchers have released MiroFlow, an open-source AI agent framework designed to overcome limitations of current LLM-based systems in complex real-world tasks. The framework features agent graph orchestration, deep reasoning capabilities, and robust workflow execution, achieving state-of-the-art performance across multiple benchmarks including GAIA and FutureX.

AI · Bearish · arXiv – CS AI · Feb 27 · 7/10 · 5

Silent Egress: When Implicit Prompt Injection Makes LLM Agents Leak Without a Trace

Researchers discovered a new vulnerability called 'silent egress', in which LLM agents can be tricked into leaking sensitive data through malicious URL previews without detection. The attack succeeds 89% of the time in tests, with 95% of successful attacks bypassing standard safety checks.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 7

Contextual Memory Virtualisation: DAG-Based State Management and Structurally Lossless Trimming for LLM Agents

Researchers introduce Contextual Memory Virtualisation (CMV), a system that preserves LLM understanding across extended sessions by treating context as version-controlled state using DAG-based management. The system includes a trimming algorithm that reduces token counts by 20-86% while preserving all user interactions, demonstrating particular efficiency in tool-use sessions.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 7

General Agent Evaluation

Researchers have developed Exgentic, a new framework for evaluating general-purpose AI agents that can perform tasks across different environments without domain-specific tuning. The study benchmarked five prominent agent implementations and found that general agents can achieve performance comparable to specialized agents, establishing the first Open General Agent Leaderboard.

AI · Bearish · arXiv – CS AI · Feb 27 · 7/10 · 4

Three AI-agents walk into a bar . . . . 'Lord of the Flies' tribalism emerges among smart AI-Agents

Research reveals that autonomous AI agents competing for limited resources form distinct tribal behaviors, with three main types emerging: Aggressive (27.3%), Conservative (24.7%), and Opportunistic (48.1%). The study found that more capable AI agents actually increase systemic failure rates and perform worse than random decision-making when competing for shared resources.

$NEAR

AI × Crypto · Bullish · Bankless · Feb 26 · 7/10 · 3

Building the Agent Economy on Ethereum

The article discusses how AI agents require cryptocurrency infrastructure to achieve scalability. It explores the technological developments needed to build an AI agent economy on the Ethereum blockchain.

$ETH

AI × Crypto · Bearish · Protos · Feb 26 · 7/10 · 3

AI agents want to identify your crypto wallet using social media

Researchers from ETH Zurich and Anthropic have developed AI agents capable of deanonymizing cryptocurrency wallets by analyzing social media posts. This research demonstrates significant privacy vulnerabilities for crypto users who share information across social platforms.

$ETH

AI × Crypto · Bullish · CryptoSlate – AI · Feb 26 · 7/10 · 8

XRPL could capture billions in machine payments but only if AI agents choose RLUSD

Ripple invested in the $5 million seed round of t54 Labs, a company positioning itself as the trust layer for the agentic AI economy. This strategic investment signals Ripple's focus on capturing the emerging machine payments market through XRPL, contingent on AI agents adopting RLUSD.

$XRP

AI · Neutral · Wired – AI · Feb 26 · 7/10 · 5

Are You ‘Agentic’ Enough for the AI Era?

Silicon Valley has developed AI coding agents capable of handling routine programming tasks, shifting the most valuable tech skill from coding execution to strategic decision-making about what AI agents should accomplish. This represents a fundamental change in how technical work is approached and valued.

AI × Crypto · Bullish · CoinTelegraph – AI · Feb 26 · 7/10 · 8

Blockchains may need 1B TPS to support AI agent future: Stripe

Stripe executives Patrick and John Collison predict that blockchain networks will need to handle 1 billion transactions per second (TPS) to support the growing adoption and use of AI agents in the future. This represents a massive scalability challenge for current blockchain infrastructure.

AI × Crypto · Bullish · CoinTelegraph – AI · Feb 23 · 7/10 · 5

How SocialFi, memecoins and AI pushed Base to the top of the L2 ladder

Base has emerged as the leading Ethereum layer-2 solution by capitalizing on trending sectors including SocialFi applications, memecoin trading, and AI agents. After achieving this market leadership position, Base is now focusing on strengthening its fundamental infrastructure and core technology stack.

$ETH

AI · Bullish · OpenAI News · Feb 23 · 7/10 · 6

OpenAI announces Frontier Alliance Partners

OpenAI announced the launch of Frontier Alliance Partners, a new initiative designed to help enterprises transition from AI pilot programs to full production deployments. The program focuses on providing secure and scalable agent deployment solutions for businesses looking to implement AI at scale.
