#formal-methods News & Analysis

7 articles tagged with #formal-methods. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Apr 10 · 7/10

Towards provable probabilistic safety for scalable embodied AI systems

Researchers propose a shift from deterministic to probabilistic safety verification for embodied AI systems, arguing that provable probabilistic guarantees offer a more practical path to large-scale deployment in safety-critical applications like autonomous vehicles and robotics than the infeasible goal of absolute safety across all scenarios.

AI · Bearish · arXiv – CS AI · Apr 7 · 7/10

Incompleteness of AI Safety Verification via Kolmogorov Complexity

Researchers prove a fundamental theoretical limit in AI safety verification using Kolmogorov complexity theory. They demonstrate that no finite formal verifier can certify all policy-compliant AI instances of arbitrarily high complexity, revealing intrinsic information-theoretic barriers beyond computational constraints.

AI · Bullish · arXiv – CS AI · Apr 6 · 7/10

SentinelAgent: Intent-Verified Delegation Chains for Securing Federal Multi-Agent AI Systems

SentinelAgent introduces a formal framework for securing multi-agent AI systems through verifiable delegation chains, achieving 100% accuracy in testing with zero false positives. The system uses seven verification properties and a non-LLM authority service to ensure secure delegation between AI agents in federal environments.

AI · Neutral · arXiv – CS AI · Apr 20 · 6/10

DPrivBench: Benchmarking LLMs' Reasoning for Differential Privacy

Researchers introduce DPrivBench, a benchmark for evaluating how well large language models can reason about differential privacy algorithms and verify their correctness. Testing shows current LLMs handle basic DP mechanisms competently but fail significantly on advanced algorithms, exposing critical gaps in automated privacy reasoning capabilities.

AI · Neutral · arXiv – CS AI · Apr 15 · 6/10

Modeling Co-Pilots for Text-to-Model Translation

Researchers introduce Text2Model and Text2Zinc, frameworks that use large language models to translate natural language descriptions into formal optimization and satisfaction models. The work represents the first unified approach combining both problem types with a solver-agnostic architecture, though experiments show that LLMs, despite competitive performance, remain imperfect at this task.

AI × Crypto · Bullish · arXiv – CS AI · Apr 13 · 6/10

SPEAR: An Engineering Case Study of Multi-Agent Coordination for Smart Contract Auditing

SPEAR is a multi-agent AI framework designed to automate smart contract auditing through coordinated specialist agents that prioritize contracts, allocate tasks, and recover from failures autonomously. The research demonstrates how established multi-agent system patterns can improve security analysis workflows beyond centralized or pipeline-based approaches.

AI · Bullish · arXiv – CS AI · Mar 12 · 6/10

FAME: Formal Abstract Minimal Explanation for Neural Networks

Researchers introduce FAME (Formal Abstract Minimal Explanations), a new method for explaining neural network decisions that scales to large networks while producing smaller explanations. The approach uses abstract interpretation and dedicated perturbation domains to eliminate irrelevant features and converge to minimal explanations more efficiently than existing methods.