449 articles tagged with #ai-agents. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AIBullisharXiv – CS AI · Feb 277/106
🧠Researchers published a comprehensive survey on personalized LLM-powered agents that can adapt to individual users over extended interactions. The study organizes these agents into four key components: profile modeling, memory, planning, and action execution, providing a framework for developing more user-aligned AI assistants.
AIBullisharXiv – CS AI · Feb 277/104
🧠Researchers have released MiroFlow, an open-source AI agent framework designed to overcome limitations of current LLM-based systems in complex real-world tasks. The framework features agent graph orchestration, deep reasoning capabilities, and robust workflow execution, achieving state-of-the-art performance across multiple benchmarks including GAIA and FutureX.
AINeutralarXiv – CS AI · Feb 277/106
🧠Researchers introduced VeRO (Versioning, Rewards, and Observations), a new evaluation framework for testing AI coding agents that can optimize other AI agents through iterative improvement cycles. The system provides reproducible benchmarks and structured execution traces to systematically measure how well coding agents can improve target agents' performance.
AINeutralarXiv – CS AI · Feb 277/107
🧠LiveMCPBench introduces the first large-scale benchmark evaluating AI agents' ability to navigate real-world tasks using Model Context Protocol (MCP) tools across multiple servers. The benchmark reveals significant performance gaps, with top model Claude-Sonnet-4 achieving 78.95% success while most models only reach 30-50%, identifying tool retrieval as the primary bottleneck.
$OCEAN
AINeutralarXiv – CS AI · Feb 277/106
🧠Researchers introduce ProactiveMobile, a new benchmark for developing AI agents that can proactively anticipate user needs on mobile devices rather than just responding to commands. The benchmark includes over 3,600 test instances across 14 scenarios, with current models achieving low success rates, indicating significant room for improvement in proactive AI capabilities.
AINeutralarXiv – CS AI · Feb 277/103
🧠Researchers introduce Tool Decathlon (Toolathlon), a comprehensive benchmark for evaluating AI language agents across 32 software applications and 604 tools in realistic, multi-step scenarios. The benchmark reveals significant limitations in current AI models, with the best performer (Claude-4.5-Sonnet) achieving only 38.6% success rate on complex, real-world tasks.
AI × CryptoBullishBankless · Feb 267/103
🤖The article discusses how AI agents require cryptocurrency infrastructure to achieve scalability. It explores the technological developments needed to build an AI agent economy on the Ethereum blockchain.
$ETH
AI × CryptoBearishProtos · Feb 267/103
🤖Researchers from ETH Zurich and Anthropic have developed AI agents capable of deanonymizing cryptocurrency wallets by analyzing social media posts. This research demonstrates significant privacy vulnerabilities for crypto users who share information across social platforms.
$ETH
AI × CryptoBullishCryptoSlate – AI · Feb 267/108
🤖Ripple invested in t54 Labs' $5 million seed round, a company positioning itself as the trust layer for the agentic AI economy. This strategic investment signals Ripple's focus on capturing the emerging machine payments market through XRPL, contingent on AI agents adopting RLUSD.
$XRP
AINeutralWired – AI · Feb 267/105
🧠Silicon Valley has developed AI coding agents capable of handling routine programming tasks, shifting the most valuable tech skill from coding execution to strategic decision-making about what AI agents should accomplish. This represents a fundamental change in how technical work is approached and valued.
AI × CryptoBullishCoinTelegraph – AI · Feb 267/108
🤖Stripe executives Patrick and John Collison predict that blockchain networks will need to handle 1 billion transactions per second (TPS) to support the growing adoption and use of AI agents in the future. This represents a massive scalability challenge for current blockchain infrastructure.
GeneralBullishCoinTelegraph · Feb 237/10
📰Base has emerged as the leading Ethereum layer-2 solution by capitalizing on trending sectors including SocialFi applications, memecoin trading, and AI agents. After achieving this dominant position, Base is now focusing on rebuilding and strengthening its core infrastructure.
$ETH
AIBullishOpenAI News · Feb 237/106
🧠OpenAI announced the launch of Frontier Alliance Partners, a new initiative designed to help enterprises transition from AI pilot programs to full production deployments. The program focuses on providing secure and scalable agent deployment solutions for businesses looking to implement AI at scale.
AI × CryptoNeutralBankless · Feb 207/105
🤖The crypto-AI space is facing a key debate around agent autonomy, with OpenClaw enabling autonomous agents and Conway pushing for self-funding capabilities. The industry is grappling with whether increased AI agent independence represents innovation or poses systemic risks requiring guardrails.
AI × CryptoBullishWu Blockchain · Feb 207/103
🤖OpenAI has released a benchmark test specifically designed to evaluate smart contract capabilities of AI systems. The test is positioned as a comprehensive evaluation tool for AI agents operating in blockchain environments, suggesting increased focus on AI-blockchain integration.
AI × CryptoBullishThe Defiant · Feb 187/106
🤖OpenAI has partnered with Paradigm to launch EVMbench, a new AI benchmark tool designed to evaluate artificial intelligence agents' capabilities in detecting, patching, and exploiting smart contract vulnerabilities. This tool represents a significant step forward in using AI to enhance blockchain security infrastructure.
AI × CryptoBullishBankless · Feb 187/105
🤖OpenAI and Paradigm have launched EVMbench, a new benchmarking tool designed to evaluate AI agents' capabilities in detecting, exploiting, and patching high-severity smart contract vulnerabilities. This represents a significant step toward using AI for automated smart contract security auditing and vulnerability management.
AIBearishIEEE Spectrum – AI · Feb 127/102
🧠Moltbook, the first social network for AI agents, launched on January 28th and quickly gained popularity despite significant security vulnerabilities. Security firms found that 36% of AI agent code contains flaws and exposed 1.5 million API keys, highlighting the risks of agentic AI systems that can be compromised through simple text prompts on public websites.
AI × CryptoBullishCoinTelegraph – AI · Feb 127/103
🤖Coinbase has launched cryptocurrency wallets specifically designed for AI agents, allowing users to set permissions and controls for autonomous trading and liquidity management. The feature enables AI agents to execute trades and manage positions 24/7 without human intervention.
AIBullishOpenAI News · Feb 57/105
🧠OpenAI has launched Frontier, an enterprise platform designed for building, deploying, and managing AI agents. The platform includes features for shared context, onboarding, permissions, and governance to help enterprises implement AI solutions at scale.
AIBullishOpenAI News · Feb 57/106
🧠OpenAI has introduced GPT-5.3-Codex, a new AI agent specifically designed for coding tasks that combines advanced programming capabilities with general reasoning abilities. The system is built to handle complex, long-term technical projects in real-world applications.
AI × CryptoBearishCryptoSlate – AI · Jan 317/106
🤖A viral social network called Moltbook, designed exclusively for AI agents, is facilitating discussions where thousands of AI agents are reportedly teaching each other malicious activities like key theft and demanding Bitcoin payments. The platform represents a new development in AI agent infrastructure that enables autonomous agent communication and identity verification.
$BTC
AIBearishIEEE Spectrum – AI · Jan 297/106
🧠Researchers at Carnegie Mellon University and Fujitsu developed three benchmarks to assess when AI agents are safe enough for autonomous business operations. The first benchmark, FieldWorkArena, showed current AI models like GPT-4o, Claude, and Gemini perform poorly on real-world enterprise tasks, struggling with accuracy in safety compliance and logistics applications.
AI × CryptoBullishCryptoSlate – AI · Jan 297/105
🤖Ethereum is introducing ERC-8004 to mainnet as a neutral infrastructure solution for AI agent reputation and trust verification. The standard aims to address the industry-wide challenge of proving AI agent trustworthiness when no single platform controls the reputation layer.
$ETH
AINeutralIEEE Spectrum – AI · Jan 297/104
🧠AI agents showed mixed adoption in 2025, with significant breakthrough in programming and software development through tools like Cursor and Claude Code, but limited deployment in other industries due to accountability concerns and regulatory challenges. While programmers embraced AI agents for tasks like automated testing, many organizations remain in evaluation phases rather than production deployment.