AI · Bearish · arXiv – CS AI · 4d ago · 7/10
🧠A large-scale empirical study of EvoMap, an agent-to-agent collaboration network, reveals critical structural flaws: 98% of assets go unused despite incentive mechanisms, quality scoring systems are easily manipulated through self-reported metadata, and over 84% of assets bypass quality checks through vacuous validation. The findings highlight fundamental challenges in designing trustworthy decentralized AI ecosystems that balance scalability with verifiable execution.
AI · Bullish · arXiv – CS AI · May 12 · 7/10
🧠Researchers introduce NEXUS, a framework enabling embodied AI agents to learn symbolic constraints for safer decision-making in physical environments. The system addresses the gap between probabilistic language models and the deterministic safety requirements of robotics by decoupling physical feasibility from safety specifications, achieving improved task success while refusing unsafe instructions.
DeFi · Bullish · crypto.news · May 11 · 7/10
💎Boundary Labs, backed by Galaxy Ventures, is launching USBD, an over-collateralized Ethereum stablecoin that replaces traditional monthly reserve attestations with continuous on-chain verification. The protocol separates yield generation into a distinct sUSBD token targeting institutional investors, aiming to create a more transparent and verifiable dollar alternative.
$ETH
AI · Bullish · arXiv – CS AI · May 11 · 7/10
🧠Researchers introduce MAVEN, a multi-agent framework that enhances large language model reasoning through explicit role-separation and intermediate verification steps. The system outperforms existing approaches on multiple benchmarks by creating verifiable, modular deliberation trajectories rather than relying on implicit reasoning or post-hoc consensus mechanisms.
AI · Bullish · arXiv – CS AI · May 11 · 7/10
🧠BEAVER is a new verification framework that computes mathematically sound probability bounds on whether large language models satisfy safety properties, identifying 2-3x more risky outputs than existing methods while using 90% less computational resources. The framework addresses a critical gap in LLM deployment by providing deterministic guarantees rather than ad-hoc sampling estimates.
AI · Bearish · arXiv – CS AI · Apr 20 · 7/10
🧠Researchers introduced ASMR-Bench, a benchmark for detecting sabotage in ML research codebases, revealing that current frontier LLMs and human auditors struggle to identify subtle implementation flaws that produce misleading results. The study found that even the best-performing model (Gemini 3.1 Pro) achieved only a 77% AUROC and a 42% fix rate, highlighting critical vulnerabilities in AI-assisted research validation.
🧠 Gemini
AI · Bullish · arXiv – CS AI · Apr 10 · 7/10
🧠Researchers propose Symbolic Equivalence Partitioning, a novel inference-time selection method for code generation that uses symbolic execution and SMT constraints to identify correct solutions without expensive external verifiers. The approach improves accuracy on HumanEval+ by 10.3% and on LiveCodeBench by 17.1% at N=10 without requiring additional LLM inference.
AI · Bearish · arXiv – CS AI · Apr 7 · 7/10
🧠Researchers prove a fundamental theoretical limit in AI safety verification using Kolmogorov complexity theory. They demonstrate that no finite formal verifier can certify all policy-compliant AI instances of arbitrarily high complexity, revealing intrinsic information-theoretic barriers beyond computational constraints.
AI · Neutral · arXiv – CS AI · Apr 6 · 7/10
🧠A new research paper presents a structured framework for translating high-level EU AI Act requirements into concrete, verifiable assessment activities across the AI lifecycle. The mapping aims to reduce interpretive uncertainty and provide consistent compliance verification mechanisms for high-risk AI systems under the new regulation.
AI · Bullish · arXiv – CS AI · Apr 6 · 7/10
🧠SentinelAgent introduces a formal framework for securing multi-agent AI systems through verifiable delegation chains, achieving 100% accuracy in testing with zero false positives. The system uses seven verification properties and a non-LLM authority service to ensure secure delegation between AI agents in federal environments.
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers introduce cross-model disagreement as a training-free method to detect when AI language models make confident errors without requiring ground truth labels. The approach uses Cross-Model Perplexity and Cross-Model Entropy to measure how surprised a second verifier model is when reading another model's answers, significantly outperforming existing uncertainty-based methods across multiple benchmarks.
🏢 Perplexity
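As a rough illustration of the two signals named in this summary, the sketch below computes perplexity and mean entropy from a verifier model's token-level outputs. It uses the textbook definitions; the paper's exact formulations may differ. The function names and the input format (per-token natural-log probabilities and next-token distributions, already obtained from a second model) are assumptions made for the example.

```python
import math


def cross_model_perplexity(token_logprobs):
    """Perplexity of an answer under the verifier model.

    token_logprobs: natural-log probabilities the verifier assigns to each
    token of the other model's answer (hypothetical input format).
    """
    if not token_logprobs:
        raise ValueError("need at least one token")
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)


def cross_model_entropy(token_distributions):
    """Mean Shannon entropy (in nats) of the verifier's next-token distributions."""
    if not token_distributions:
        raise ValueError("need at least one distribution")
    entropies = [
        -sum(p * math.log(p) for p in dist if p > 0.0)
        for dist in token_distributions
    ]
    return sum(entropies) / len(entropies)
```

Higher values on either measure mean the verifier is more surprised by the answer, which is the signal the summary describes as an indicator of a confident error.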
DeFi · Bullish · The Block · Mar 26 · 6/10
💎BlackRock's BUIDL fund, the world's largest tokenized fund managing $1.7 billion in Treasuries and cash, has integrated Chronicle as a new verification layer. This development strengthens the infrastructure supporting tokenized traditional financial assets.
Crypto · Bearish · CryptoPotato · Mar 15 · 7/10
⛓️A CertiK report reveals that crypto ATM fraud has surged dramatically, resulting in $333 million in losses during 2025. The fraud exploits crypto ATMs' minimal verification requirements and fast transaction processing, allowing criminals to quickly convert cash into digital assets before victims can detect the fraudulent activity.
AI × Crypto · Neutral · arXiv – CS AI · Mar 12 · 7/10
🤖Researchers propose NabaOS, a lightweight verification framework that detects AI agent hallucinations using HMAC-signed tool receipts instead of zero-knowledge proofs. The system achieves 94.2% detection accuracy with <15ms verification time, compared to cryptographic approaches that require 180+ seconds per query.
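To make the idea of an HMAC-signed tool receipt concrete, here is a minimal sketch using Python's standard `hmac` module. It is not NabaOS's actual receipt format: the field names, the JSON serialization, and the shared demo key are all assumptions made for illustration.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret held by the tool runtime and the verifier.
SECRET_KEY = b"demo-key-not-for-production"


def _payload(tool_name, output):
    # Canonical serialization so signer and verifier hash identical bytes.
    return json.dumps({"tool": tool_name, "output": output}, sort_keys=True).encode()


def sign_receipt(tool_name, output, key=SECRET_KEY):
    """Return a receipt binding a tool's output to an HMAC-SHA256 tag."""
    tag = hmac.new(key, _payload(tool_name, output), hashlib.sha256).hexdigest()
    return {"tool": tool_name, "output": output, "tag": tag}


def verify_receipt(receipt, key=SECRET_KEY):
    """Check that the claimed tool output matches the signed tag."""
    expected = hmac.new(
        key, _payload(receipt["tool"], receipt["output"]), hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking tag bytes through timing.
    return hmac.compare_digest(expected, receipt["tag"])
```

If an agent reports a tool result that was never produced, or alters one after the fact, the tag no longer verifies. Computing one keyed hash is cheap, which is consistent with the millisecond-scale verification time quoted in the summary.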
AI × Crypto · Bullish · arXiv – CS AI · Mar 9 · 7/10
🤖Researchers propose a 'proof-of-guardrail' system that uses cryptographic proofs and Trusted Execution Environments to verify AI agent safety measures. The system allows users to cryptographically verify that AI responses were generated after specific open-source safety guardrails were executed, addressing concerns about falsely advertised safety measures.
AI · Bullish · arXiv – CS AI · Mar 5 · 7/10
🧠Researchers propose LEAP, a new framework for detecting AI hallucinations using efficient small models that can dynamically adapt verification strategies. The system uses a teacher-student approach where a powerful model trains smaller ones to detect false outputs, addressing a critical barrier to safe AI deployment in production environments.
AI · Neutral · arXiv – CS AI · Mar 5 · 7/10
🧠Researchers propose a new framework for Agentic Peer-to-Peer Networks where AI agents on edge devices can collaborate by sharing capabilities and actions rather than static files. The system introduces tiered verification methods to ensure security and reliability when AI agents delegate tasks to untrusted peers in decentralized networks.
AI × Crypto · Bullish · CoinDesk · Mar 4 · 7/10 · 3
🤖The Ethereum Foundation, through AI lead Davide Crapis, is positioning Ethereum to serve as a trust layer for artificial intelligence applications. The foundation envisions the network functioning as a coordination and verification infrastructure in a world increasingly dominated by AI-mediated interactions.
$ETH
AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 2
🧠Researchers introduce RIVA, a multi-agent AI system that uses specialized verification agents and cross-validation to detect infrastructure configuration drift more reliably. The system improves accuracy from 27.3% to 50% when dealing with erroneous tool responses, addressing a critical reliability issue in cloud infrastructure management.
AI · Bullish · arXiv – CS AI · Mar 4 · 6/10 · 4
🧠Researchers have developed a framework that allows neural network verification tools to accept natural language specifications instead of low-level technical constraints. The system automatically translates human-readable requirements into formal verification queries, significantly expanding the practical applicability of neural network verification across diverse domains.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠Researchers have developed Hierarchical Speculative Decoding (HSD), a new method that significantly improves AI inference speed while maintaining accuracy by solving joint intractability problems in verification processes. The technique shows over 12% performance gains when integrated with existing frameworks like EAGLE-3, establishing new state-of-the-art efficiency standards.
AI × Crypto · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4
🤖TAO is a new verification protocol that enables users to verify neural network outputs from untrusted cloud services without requiring exact computation matches. The system uses tolerance-aware verification with IEEE-754 bounds and empirical profiles, implementing a dispute resolution mechanism deployed on Ethereum testnet.
$ETH $TAO
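The core idea of tolerance-aware checking, accepting outputs that differ only by floating-point noise rather than demanding bit-exact matches, can be sketched as below. This is not TAO's protocol: the summary says its bounds come from IEEE-754 analysis and empirical profiles, whereas the fixed `rel_tol` and `abs_tol` values and the function name here are placeholder assumptions.

```python
import math


def disputed_indices(claimed, recomputed, rel_tol=1e-5, abs_tol=1e-8):
    """Return positions where two output vectors differ beyond tolerance.

    claimed: values reported by the untrusted service.
    recomputed: values obtained by the verifier on its own hardware.
    The tolerances are illustrative placeholders, not derived error bounds.
    """
    if len(claimed) != len(recomputed):
        raise ValueError("output vectors must have the same length")
    return [
        i
        for i, (a, b) in enumerate(zip(claimed, recomputed))
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
    ]
```

An empty result means the claimed output is accepted; any returned index would be a candidate for the kind of dispute resolution step the summary mentions.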
AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠Researchers have developed VeriTrail, the first closed-domain hallucination detection method that can trace where AI-generated misinformation originates in multi-step processes. The system addresses a critical problem where language models generate unsubstantiated content even when instructed to stick to source material, with the risk being higher in complex multi-step generative processes.
AI × Crypto · Bullish · CoinTelegraph – AI · Feb 10 · 7/10 · 6
🤖Ethereum co-founder Vitalik Buterin outlined how Ethereum could integrate with AI systems by providing privacy infrastructure, verification mechanisms, and economic layers. This integration aims to help decentralize AI development and create broader societal benefits through blockchain-based solutions.
$ETH
AI · Neutral · arXiv – CS AI · 2d ago · 6/10
🧠Researchers propose a hybrid reasoning system that combines Large Language Models with preference-based Maximum Satisfiability solvers to tackle complex optimization problems with multiple constraints. The approach achieves over 80% correctness rates on preference-based reasoning tasks, substantially outperforming traditional LLM baselines that rarely produce feasible solutions.
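For readers unfamiliar with Maximum Satisfiability, the sketch below shows the underlying problem in its plain weighted form: hard clauses must all hold, and the solver maximizes the total weight of satisfied soft clauses. It is a brute-force toy, not the preference-based solver or the LLM pipeline from the paper, and the clause encoding (DIMACS-style signed integers) is an assumption made for the example.

```python
from itertools import product


def satisfied(clause, assignment):
    """A clause holds if any literal matches the assignment.

    Literals are signed ints: 2 means x2 is true, -2 means x2 is false.
    """
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)


def solve_maxsat(num_vars, hard, soft):
    """Exhaustively search for the best assignment.

    hard: list of clauses that must all be satisfied.
    soft: list of (weight, clause) pairs; satisfied weight is maximized.
    Returns (assignment, weight), or (None, -1) if the hard clauses conflict.
    """
    best, best_weight = None, -1
    for values in product([False, True], repeat=num_vars):
        assignment = dict(enumerate(values, start=1))
        if not all(satisfied(c, assignment) for c in hard):
            continue  # infeasible: violates a hard constraint
        weight = sum(w for w, c in soft if satisfied(c, assignment))
        if weight > best_weight:
            best, best_weight = assignment, weight
    return best, best_weight
```

In a hybrid setup like the one summarized, a language model would plausibly translate a natural-language request into the hard and soft clauses while a solver handles the search; exhaustive enumeration is only practical for a handful of variables.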