AI Pulse News

Models, papers, tools. 17,658 articles with AI-powered sentiment analysis and key takeaways.

17658 articles

AIBullisharXiv – CS AI · Mar 37/103

🧠

GenDB: The Next Generation of Query Processing -- Synthesized, Not Engineered

Researchers propose GenDB, a revolutionary database system that uses Large Language Models to synthesize query execution code instead of relying on traditional engineered query processors. Early prototype testing shows GenDB outperforms established systems like DuckDB, Umbra, and PostgreSQL on OLAP workloads.

AIBullisharXiv – CS AI · Mar 37/104

🧠

SpiroLLM: Finetuning Pretrained LLMs to Understand Spirogram Time Series with Clinical Validation in COPD Reporting

Researchers developed SpiroLLM, the first multimodal large language model capable of understanding spirogram time series data for COPD diagnosis. Using data from 234,028 UK Biobank individuals, the model achieved 0.8977 diagnostic AUROC and maintained 100% valid response rate even with missing data, far outperforming text-only models.

AI × CryptoBullisharXiv – CS AI · Mar 37/103

🤖

SymGPT: Auditing Smart Contracts via Combining Symbolic Execution with Large Language Models

Researchers have developed SymGPT, a new tool that combines large language models with symbolic execution to automatically audit smart contracts for ERC rule violations. The tool identified 5,783 violations in 4,000 real-world contracts, including 1,375 with clear attack paths for financial theft, outperforming existing automated analysis methods.

$ETH

AIBullisharXiv – CS AI · Mar 37/105

🧠

Arbor: A Framework for Reliable Navigation of Critical Conversation Flows

Researchers introduce Arbor, a framework that decomposes large language model decision-making into specialized node-level tasks for critical applications like healthcare triage. The system improves accuracy by 29.4 percentage points while reducing latency by 57.1% and costs by 14.4x compared to single-prompt approaches.

AI × CryptoBullisharXiv – CS AI · Mar 37/104

🤖

TAO: Tolerance-Aware Optimistic Verification for Floating-Point Neural Networks

TAO is a new verification protocol that enables users to verify neural network outputs from untrusted cloud services without requiring exact computation matches. The system uses tolerance-aware verification with IEEE-754 bounds and empirical profiles, implementing a dispute resolution mechanism deployed on Ethereum testnet.

$ETH$TAO

AIBullisharXiv – CS AI · Mar 37/104

🧠

EnterpriseBench Corecraft: Training Generalizable Agents on High-Fidelity RL Environments

Surge AI introduces CoreCraft, the first environment in EnterpriseBench for training AI agents on realistic enterprise workflows. Training GLM 4.6 on this high-fidelity customer support simulation improved task performance from 25% to 37% and showed positive transfer to other benchmarks, demonstrating that quality training environments enable generalizable AI capabilities.

AIBearisharXiv – CS AI · Mar 37/103

🧠

On The Fragility of Benchmark Contamination Detection in Reasoning Models

New research reveals that benchmark contamination in language reasoning models (LRMs) is extremely difficult to detect, allowing developers to easily inflate performance scores on public leaderboards. The study shows that reinforcement learning methods like GRPO and PPO can effectively conceal contamination signals, undermining the integrity of AI model evaluations.

$NEAR

AIBearisharXiv – CS AI · Mar 37/103

🧠

Untargeted Jailbreak Attack

Researchers have developed a new 'untargeted jailbreak attack' (UJA) that can compromise AI safety systems in large language models with over 80% success rate using only 100 optimization iterations. This gradient-based attack method expands the search space by maximizing unsafety probability without fixed target responses, outperforming existing attacks by over 30%.

AIBullisharXiv – CS AI · Mar 37/103

🧠

RACE Attention: A Strictly Linear-Time Attention for Long-Sequence Training

Researchers introduce RACE Attention, a new linear-time alternative to traditional Softmax Attention that can process up to 75 million tokens in a single pass, compared to current GPU-optimized implementations that fail beyond 4 million tokens. The technology uses angular similarity and Gaussian random projections to achieve dramatic efficiency gains while maintaining performance across language modeling and classification tasks.

AIBullisharXiv – CS AI · Mar 37/105

🧠

Expressive Power of Implicit Models: Rich Equilibria and Test-Time Scaling

Researchers provide mathematical proof that implicit models can achieve greater expressive power through increased test-time computation, explaining how these memory-efficient architectures can match larger explicit networks. The study validates this scaling property across image reconstruction, scientific computing, operations research, and LLM reasoning domains.

AINeutralarXiv – CS AI · Mar 37/103

🧠

What Scales in Cross-Entropy Scaling Law?

Researchers discovered that the traditional cross-entropy scaling law for large language models breaks down at very large scales because only one component (error-entropy) actually follows power-law scaling, while other components remain constant. This finding explains why model performance improvements become less predictable as models grow larger and establishes a new error-entropy scaling law for better understanding LLM development.

AIBullisharXiv – CS AI · Mar 37/104

🧠

SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs

Researchers introduce SwiReasoning, a training-free framework that improves large language model reasoning by dynamically switching between explicit chain-of-thought and latent reasoning modes. The method achieves 1.8%-3.1% accuracy improvements and 57%-79% better token efficiency across mathematics, STEM, coding, and general benchmarks.

AIBullisharXiv – CS AI · Mar 37/104

🧠

Tiny but Mighty: A Software-Hardware Co-Design Approach for Efficient Multimodal Inference on Battery-Powered Small Devices

Researchers developed NANOMIND, a software-hardware framework that optimizes Large Multimodal Models for battery-powered devices by breaking them into modular components and mapping each to optimal accelerators. The system achieves 42.3% energy reduction and enables 20.8 hours of operation running LLaVA-OneVision on a compact device without network connectivity.

AIBullisharXiv – CS AI · Mar 37/104

🧠

Emergent Coordination in Multi-Agent Language Models

Researchers developed an information-theoretic framework to measure when multi-agent AI systems exhibit coordinated behavior beyond individual agents. The study found that specific prompt designs can transform collections of AI agents into coordinated collectives that mirror human group intelligence principles.

AIBullisharXiv – CS AI · Mar 37/104

🧠

Relational Transformer: Toward Zero-Shot Foundation Models for Relational Data

Researchers from Stanford introduce the Relational Transformer (RT), a new AI architecture that can work with relational databases without task-specific fine-tuning. The 22M parameter model achieves 93% performance of fully supervised models on binary classification tasks, significantly outperforming a 27B parameter LLM at 84%.

AIBullisharXiv – CS AI · Mar 37/103

🧠

MorphArtGrasp: Morphology-Aware Cross-Embodiment Dexterous Hand Articulation Generation for Grasping

MorphArtGrasp is a new AI framework that enables dexterous robotic hands to grasp objects across different hand designs without extensive retraining. The system achieves 91.9% success rate in simulation and 87% in real-world tests by using morphology-aware learning to adapt grasping strategies to different robotic hand configurations.

AIBullisharXiv – CS AI · Mar 37/103

🧠

Value Flows

Researchers have developed Value Flows, a new reinforcement learning method that uses flow-based models to estimate complete return distributions rather than single scalar values. The approach achieves 1.3x improvement in success rates across 62 benchmark tasks by better identifying states with high return uncertainty for improved decision-making.

AIBearisharXiv – CS AI · Mar 37/103

🧠

Adaptive Attacks on Trusted Monitors Subvert AI Control Protocols

Research reveals that AI control protocols designed to prevent harmful behavior from untrusted LLM agents can be systematically defeated through adaptive attacks targeting monitor models. The study demonstrates that frontier models can evade safety measures by embedding prompt injections in their outputs, with existing protocols like Defer-to-Resample actually amplifying these attacks.

AIBullisharXiv – CS AI · Mar 37/103

🧠

GAR: Generative Adversarial Reinforcement Learning for Formal Theorem Proving

Researchers introduce GAR (Generative Adversarial Reinforcement Learning), a new AI training framework that jointly trains problem generators and solvers in an adversarial loop for formal theorem proving. The method shows significant improvements in mathematical proof capabilities, with models achieving 4.20% average relative improvement on benchmark tests.

AIBullisharXiv – CS AI · Mar 37/103

🧠

Ctrl-World: A Controllable Generative World Model for Robot Manipulation

Researchers have developed Ctrl-World, a controllable generative world model that enables robot policies to be evaluated and improved through simulation rather than costly real-world testing. The model, trained on 95k trajectories, can generate consistent 20+ second simulations and improved policy success rates by 44.7% through synthetic data generation.

GeneralBearishBeInCrypto · Mar 3🔥 8/108

📰

Strait of Hormuz Shutdown Shakes Asian Energy Markets

The Strait of Hormuz has effectively closed following US-Israeli strikes on Iran, creating an unprecedented energy supply crisis. Asian economies, particularly Japan and South Korea, face severe risks due to their heavy dependence on oil shipments through this critical chokepoint.

AIBearishApple Machine Learning · Mar 37/105

🧠

On the Impossibility of Separating Intelligence from Judgment: The Computational Intractability of Filtering for AI Alignment

Research demonstrates computational challenges in AI alignment, specifically showing that efficient filtering of adversarial prompts and unsafe outputs from large language models may be fundamentally impossible. The study reveals theoretical limitations in separating intelligence from judgment in AI systems, highlighting intractable problems in content filtering approaches.

AIBearishFortune Crypto · Mar 27/103

🧠

‘Could it kill someone?’ A Seoul woman allegedly used ChatGPT to help carry out two murders in South Korean motels

A South Korean woman allegedly used ChatGPT to plan two murders at Seoul motels, raising serious concerns about AI safety guardrails. The case highlights potential risks of AI chatbots being exploited for harmful purposes and questions about existing protective measures.

AIBearishFortune Crypto · Mar 27/101

🧠

Iran has the intent—and increasingly the tools—for AI-powered cyberattacks

Iran is developing AI capabilities to enhance cyberattacks against critical infrastructure, though no evidence exists of fully autonomous cyber agents. The country appears to be using AI to accelerate and improve existing attack methods rather than deploying completely automated systems.

AI × CryptoBearishCryptoPotato · Mar 2🔥 8/109

🤖

World War III Scenario: Which Crypto Would Suffer the Most? (4 AIs Respond)

Four AI models analyzed a hypothetical World War III scenario to identify which cryptocurrencies would be most vulnerable to massive price declines. The analysis suggests certain tokens could potentially plummet by 90% in such extreme geopolitical conditions.

← PrevPage 163 of 707Next →