#ai-agents News & Analysis
Coverage of #ai-agents has generated 98 articles over the past month, with 61.2% maintaining a bullish sentiment. Discussion remains stable compared to the previous quarter, reflecting consistent interest rather than sudden shifts in outlook. The conversation centers on major AI models including GPT-5 and Claude, with substantial research contributions tracked through arXiv's computer science and AI channels alongside cryptocurrency-focused outlets.
The topic frequently intersects with machine learning, large language models, and automation research, while also appearing alongside discussions of blockchain assets like Ethereum and Bitcoin. Scan the articles below to explore how #ai-agents are being developed, deployed, and analyzed across technical and financial perspectives.
Sentiment · last 30d (98 articles)
Top sources: arXiv – CS AI (243), Crypto Briefing (19), CoinDesk (18), Fortune Crypto (12), TechCrunch – AI (12)
Most-discussed entities: GPT-5 (13), Claude (13), Anthropic (10), OpenAI (9), Opus (6)
AI × Crypto · Neutral · Crypto Briefing · Mar 3 · 7/10 · 3
🤖Haseeb Qureshi discusses how AI agents are becoming proficient in cybercrime activities, while crypto still faces fundamental usability challenges rooted in underlying technology. He argues that smart contracts cannot completely replace traditional legal agreements in complex financial arrangements.
AI · Bullish · Crypto Briefing · Mar 3 · 7/10 · 2
🧠Emad Mostaque predicts AI agents will become mainstream this year, reducing operational friction and boosting profitability across industries. He suggests the future of AI development will move beyond transformer architectures, promising unprecedented efficiency gains that could reshape economic landscapes.
AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠Researchers introduce PsyAgent, a new AI framework that creates human-like agents by combining personality modeling based on Big Five traits with contextual social awareness. The system uses structured prompts and fine-tuning to produce AI agents that maintain stable personality traits while adapting appropriately to different social situations and roles.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 2
🧠Researchers have developed FM Agent, a multi-agent AI framework that combines large language models with evolutionary search to autonomously solve complex research problems. The system achieved state-of-the-art results across multiple domains including operations research, machine learning, and GPU optimization without human intervention.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠Researchers introduced AgentMath, a new AI framework that combines language models with code interpreters to solve complex mathematical problems more efficiently than current Large Reasoning Models. The system achieves state-of-the-art performance on mathematical competition benchmarks, with AgentMath-30B-A3B reaching 90.6% accuracy on AIME24 while remaining competitive with much larger models like OpenAI-o3.
AI · Bearish · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠Researchers have developed AudAgent, an automated tool that monitors AI agents in real time to check that they comply with their stated privacy policies. The tool revealed that many AI agents powered by models from major providers, including Claude, Gemini, and DeepSeek, fail to protect highly sensitive data such as SSNs and violate their own privacy policies.
$LINK
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3
🧠Researchers introduce PolySkill, a framework that enables AI agents to learn generalizable skills by separating abstract goals from concrete implementations, inspired by software engineering polymorphism. The method improves skill reuse by 1.7x and boosts success rates by up to 13.9% on web navigation tasks while reducing execution steps by over 20%.
AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 3
🧠Researchers introduce InnoGym, the first benchmark designed to evaluate AI agents' innovation potential rather than just correctness. The framework measures both performance gains and methodological novelty across 18 real-world engineering and scientific tasks, revealing that while AI agents can generate novel approaches, they lack robustness for significant performance improvements.
AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠Researchers introduce GLEE, a new framework for studying how Large Language Models behave in economic games and strategic interactions. The study reveals that LLM performance in economic scenarios depends heavily on market parameters and model selection, with complex interdependent effects on outcomes.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3
🧠Researchers propose GenDB, a database system that uses Large Language Models to synthesize query execution code instead of relying on traditional engineered query processors. Early prototype testing shows GenDB outperforming established systems such as DuckDB, Umbra, and PostgreSQL on OLAP workloads.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠Surge AI introduces CoreCraft, the first environment in EnterpriseBench for training AI agents on realistic enterprise workflows. Training GLM 4.6 on this high-fidelity customer support simulation improved task performance from 25% to 37% and showed positive transfer to other benchmarks, demonstrating that quality training environments enable generalizable AI capabilities.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠Researchers introduce AgentOCR, a framework that converts AI agent interaction histories from text to compressed visual format, reducing token usage by over 50% while maintaining 95% performance. The system uses visual caching and adaptive compression to address memory bottlenecks in large language model deployments.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3
🧠Researchers have developed MagicAgent, a series of foundation models designed for generalized AI agent planning that outperforms existing sub-100B models and even surpasses leading ultra-scale models like GPT-5.2. The models achieve superior performance through a novel synthetic data framework and two-stage training paradigm that addresses gradient interference in multi-task learning.
AI × Crypto · Bullish · CoinTelegraph – AI · Feb 27 · 7/10 · 8
🤖Alchemy has launched autonomous payment rails for AI agents on the Base blockchain, enabling automated payments for blockchain data and compute credits using USDC. This development supports the growing trend of autonomous crypto applications by providing seamless payment infrastructure for AI-driven systems.
AI × Crypto · Bullish · Bitcoinist · Feb 27 · 7/10 · 4
🤖Ethereum is emerging as the dominant blockchain for AI agent development, expanding beyond its traditional DeFi leadership role. The network is now positioning itself as the primary platform for on-chain AI innovation, demonstrating constructive rather than speculative growth.
$ETH
AI × Crypto · Bullish · The Block · Feb 27 · 7/10 · 7
🤖DeFi leaders from Ondo and Galaxy Digital discuss the emerging trends of Real World Assets (RWAs), AI integration, and tokenized equities in decentralized finance. The conversation explores how AI agents are expected to transform DeFi trading practices and provides a bullish outlook despite current market conditions.
AI · Bullish · OpenAI News · Feb 27 · 7/10 · 6
🧠OpenAI and Amazon have announced a strategic partnership that will integrate OpenAI's Frontier platform with AWS infrastructure. The collaboration aims to expand AI capabilities through enhanced infrastructure, custom model development, and enterprise AI agent solutions.
AI · Bullish · OpenAI News · Feb 27 · 7/10 · 5
🧠Amazon Bedrock introduces a new Stateful Runtime Environment for AI agents that provides persistent orchestration, memory capabilities, and secure execution for complex multi-step AI workflows. The service leverages OpenAI technology to enable more sophisticated AI agent operations with maintained state across interactions.
AI · Bearish · arXiv – CS AI · Feb 27 · 7/10 · 4
🧠Research reveals that autonomous AI agents competing for limited resources form distinct tribal behaviors, with three main types emerging: Aggressive (27.3%), Conservative (24.7%), and Opportunistic (48.1%). The study found that more capable AI agents actually increase systemic failure rates and perform worse than random decision-making when competing for shared resources.
$NEAR
AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 7
🧠Researchers have developed Exgentic, a new framework for evaluating general-purpose AI agents that can perform tasks across different environments without domain-specific tuning. The study benchmarked five prominent agent implementations and found that general agents can achieve performance comparable to specialized agents, establishing the first Open General Agent Leaderboard.
AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 7
🧠Researchers introduce Contextual Memory Virtualisation (CMV), a system that preserves LLM understanding across extended sessions by treating context as version-controlled state using DAG-based management. The system includes a trimming algorithm that reduces token counts by 20-86% while preserving all user interactions, demonstrating particular efficiency in tool-use sessions.
AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 7
🧠Researchers introduce OmniGAIA, a comprehensive benchmark for evaluating omni-modal AI agents that can process video, audio, and image data simultaneously with complex reasoning capabilities. They also propose OmniAtlas, a foundation agent that enhances existing open-source models' ability to use tools across multiple modalities, marking progress toward more capable AI assistants.
AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 7
🧠Researchers introduce SUPERGLASSES, the first comprehensive benchmark for evaluating Vision Language Models in AI smart glasses applications, comprising 2,422 real-world egocentric image-question pairs. They also propose SUPERLENS, a multimodal agent that outperforms GPT-4o by 2.19% through retrieval-augmented answer generation with automatic object detection and web search capabilities.
AI · Bearish · arXiv – CS AI · Feb 27 · 7/10 · 5
🧠Researchers discovered a new vulnerability called 'silent egress' where LLM agents can be tricked into leaking sensitive data through malicious URL previews without detection. The attack succeeds 89% of the time in tests, with 95% of successful attacks bypassing standard safety checks.
AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 4
🧠Researchers have released MiroFlow, an open-source AI agent framework designed to overcome limitations of current LLM-based systems in complex real-world tasks. The framework features agent graph orchestration, deep reasoning capabilities, and robust workflow execution, achieving state-of-the-art performance across multiple benchmarks including GAIA and FutureX.