Models, papers, tools. 17,651 articles with AI-powered sentiment analysis and key takeaways.
AI · Neutral · arXiv – CS AI · Mar 37/103
🧠Researchers introduced MMR-Life, a comprehensive benchmark with 2,646 questions and 19,108 real-world images to evaluate multimodal reasoning capabilities of AI models. Even top models like GPT-5 achieved only 58% accuracy, highlighting significant challenges in real-world multimodal reasoning across seven different reasoning types.
AI · Neutral · arXiv – CS AI · Mar 37/103
🧠Researchers prove that gradient descent in neural networks converges to optimal robustness margins at an extremely slow rate of Θ(1/ln(t)), even in simplified two-neuron settings. This establishes the first explicit lower bound on convergence rates for robustness margins in non-linear models, revealing fundamental limitations in neural network training efficiency.
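To get intuition for how slow a Θ(1/ln t) rate is, here is a tiny numeric sketch (purely illustrative; not the paper's two-neuron model):

```python
import math

def margin_gap(t, c=1.0):
    """Gap to the optimal robustness margin when it closes at rate c / ln(t)."""
    return c / math.log(t)

# Going from 100 steps to 1,000,000 steps (10,000x more compute)
# only shrinks the gap to about a third of its previous value,
# because ln(t) grows extremely slowly.
gap_short = margin_gap(1e2)
gap_long = margin_gap(1e6)
```

The ratio of the two gaps is ln(100)/ln(10⁶) = 2/6 ≈ 0.33, regardless of the constant c.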
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers introduce Robometer, a new framework for training robot reward models that combines progress tracking with trajectory comparisons to better learn from failed attempts. The system is trained on RBM-1M, a dataset of over one million robot trajectories including failures, and shows improved performance across diverse robotics applications.
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Meta presents CharacterFlywheel, an iterative process for improving large language models in production social chat applications across Instagram, WhatsApp, and Messenger. Starting from LLaMA 3.1, the system achieved significant improvements through 15 generations of refinement, with the best models showing up to 8.8% improvement in engagement breadth and 19.4% in engagement depth while substantially improving instruction following capabilities.
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers propose Causal Delta Embeddings, a new method for learning robust AI representations from image pairs that improves out-of-distribution performance. The approach focuses on representing interventions in causal models rather than just scene variables, achieving significant improvements in synthetic and real-world benchmarks without additional supervision.
AI · Bullish · arXiv – CS AI · Mar 37/102
🧠Researchers have developed FM Agent, a multi-agent AI framework that combines large language models with evolutionary search to autonomously solve complex research problems. The system achieved state-of-the-art results across multiple domains including operations research, machine learning, and GPU optimization without human intervention.
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers have developed a new approach called Model Predictive Adversarial Imitation Learning that combines inverse reinforcement learning with model predictive control to enable AI agents to learn from incomplete human demonstrations. The method shows significant improvements in sample efficiency, generalization, and robustness compared to traditional imitation learning approaches.
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers have developed SageBwd, a trainable INT8 attention mechanism that can match full-precision attention performance during pre-training while quantizing six of seven attention matrix multiplications. The study identifies key factors for stable training including QK-norm requirements and the impact of tokens per step on quantization errors.
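The summary doesn't give SageBwd's exact scheme, but symmetric per-tensor INT8 quantization, the usual starting point for quantized attention matmuls, can be sketched as follows (hypothetical helper names, not the paper's API):

```python
def quantize_int8(xs):
    """Symmetric per-tensor INT8 quantization: map floats to [-127, 127]."""
    scale = max(abs(x) for x in xs) / 127.0 or 1.0  # avoid div-by-zero on all-zeros
    q = [max(-127, min(127, round(x / scale))) for x in xs]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float values from INT8 codes."""
    return [v * scale for v in q]

vals = [0.5, -1.0, 0.25]
codes, scale = quantize_int8(vals)
approx = dequantize_int8(codes, scale)  # within one quantization step of vals
```

The quantization error per element is bounded by the scale, which is why training stability hinges on keeping activation ranges well-behaved (the QK-norm requirement the study identifies).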
AI × Crypto · Bullish · arXiv – CS AI · Mar 37/103
🤖Researchers have developed SymGPT, a new tool that combines large language models with symbolic execution to automatically audit smart contracts for ERC rule violations. The tool identified 5,783 violations in 4,000 real-world contracts, including 1,375 with clear attack paths for financial theft, outperforming existing automated analysis methods.
$ETH
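For a flavor of what "ERC rule violations" means, here is a toy checker for one ERC-20 rule (a successful transfer must emit a Transfer event). This is an illustration of the rule being audited, not SymGPT's method, which combines LLMs with symbolic execution:

```python
def audit_transfer(trace):
    """Return violation messages for one ERC-20 rule on an execution trace.
    trace: dict with 'call', 'returned', and 'events' keys (toy format)."""
    violations = []
    if trace["call"] == "transfer" and trace["returned"] is True:
        if "Transfer" not in trace["events"]:
            violations.append("transfer() returned true without emitting Transfer")
    return violations

ok = audit_transfer({"call": "transfer", "returned": True, "events": ["Transfer"]})
bad = audit_transfer({"call": "transfer", "returned": True, "events": []})
```

Tools in this space check many such rules; symbolic execution lets them prove a violating path is actually reachable rather than just pattern-matching source code.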
AI × Crypto · Bullish · arXiv – CS AI · Mar 37/104
🤖TAO is a new verification protocol that enables users to verify neural network outputs from untrusted cloud services without requiring exact computation matches. The system uses tolerance-aware verification with IEEE-754 bounds and empirical profiles, implementing a dispute resolution mechanism deployed on Ethereum testnet.
$ETH $TAO
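The core idea, accepting numerically close rather than bit-identical outputs, can be sketched with Python's stdlib (assumed tolerances, not TAO's actual IEEE-754 bounds):

```python
import math

def verify_within_tolerance(claimed, reference, rel_tol=1e-5, abs_tol=1e-7):
    """Accept a claimed output vector if every element is within tolerance of a
    trusted recomputation, instead of demanding exact floating-point equality."""
    return len(claimed) == len(reference) and all(
        math.isclose(c, r, rel_tol=rel_tol, abs_tol=abs_tol)
        for c, r in zip(claimed, reference)
    )

reference = [0.1 + 0.2, 1.0]           # 0.1 + 0.2 != 0.3 exactly in IEEE-754
honest = verify_within_tolerance([0.3, 1.0], reference)     # FP drift tolerated
cheating = verify_within_tolerance([0.31, 1.0], reference)  # real deviation caught
```

Exact-match verification fails on honest providers because GPU kernels, summation order, and hardware all introduce tiny nondeterministic differences; tolerance-aware checks are what make disputes resolvable.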
AI · Bullish · arXiv – CS AI · Mar 37/105
🧠Researchers developed HierarchicalPrune, a compression framework that reduces large-scale text-to-image diffusion models' memory footprint by 77.5-80.4% and latency by 27.9-38.0% while maintaining image quality. The technique enables billion-parameter AI models to run efficiently on resource-constrained devices through hierarchical pruning and knowledge distillation.
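A greedy importance-based pruning pass, a simplification of HierarchicalPrune's hierarchical criterion with made-up block names and numbers, might look like:

```python
def prune_to_budget(blocks, budget_mb):
    """Keep the most important blocks that fit in the memory budget.
    blocks: list of (name, size_mb, importance); returns kept names."""
    kept, used = [], 0.0
    for name, size, importance in sorted(blocks, key=lambda b: -b[2]):
        if used + size <= budget_mb:
            kept.append(name)
            used += size
    return kept

blocks = [("block_a", 50, 0.9), ("block_b", 30, 0.2), ("block_c", 40, 0.8)]
kept = prune_to_budget(blocks, budget_mb=90)  # drops the least important block
```

Knowledge distillation then fine-tunes the pruned model against the original's outputs to recover the quality lost by removing blocks.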
AI · Neutral · arXiv – CS AI · Mar 37/104
🧠Researchers analyzed Mixture-of-Experts (MoE) language models to determine optimal sparsity levels for different tasks. They found that reasoning tasks require balancing active compute (FLOPs) with optimal data-to-parameter ratios, while memorization tasks benefit from more parameters regardless of sparsity.
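To make "active compute" concrete: in a sparse MoE layer only the top-k of N experts run per token, so FLOPs scale with k while parameter count scales with N. A toy scalar version:

```python
def moe_forward(x, experts, gate_scores, k=2):
    """Run only the top-k experts by gate score and mix their outputs
    with renormalized gate weights (toy scalar Mixture-of-Experts)."""
    top = sorted(range(len(experts)), key=lambda i: -gate_scores[i])[:k]
    total = sum(gate_scores[i] for i in top)
    return sum(gate_scores[i] / total * experts[i](x) for i in top)

experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x]
sparse_out = moe_forward(3.0, experts, gate_scores=[0.1, 0.6, 0.3], k=2)
```

The sparsity level is the ratio k/N: lowering k saves active compute per token without shrinking the total parameter pool, which is exactly the trade-off the study finds matters differently for reasoning versus memorization.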
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers have developed FROGENT, an AI multi-agent system that uses large language models to automate the entire drug discovery pipeline from target identification to synthesis planning. The system outperformed existing AI approaches across eight benchmarks and demonstrated practical applications in real-world drug design scenarios.
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers developed a novel learning approach for spiking neural networks that optimizes both synaptic weights and intrinsic neuronal parameters, achieving up to 13.50-percentage-point improvements in classification accuracy. The study introduces a biologically inspired SNN-LZC classifier that achieves 99.50% accuracy with sub-millisecond inference latency.
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers developed a new disentangled multi-modal framework that combines histopathology and transcriptome data for improved cancer diagnosis and prognosis. The framework addresses key challenges in medical AI including multi-modal data heterogeneity and dependency on paired datasets through innovative fusion techniques and knowledge distillation strategies.
AI · Neutral · arXiv – CS AI · Mar 37/104
🧠Researchers have developed a method to implement Pearl's causal inference framework (DO-calculus) on quantum circuits, mapping causal networks to quantum hardware through 'circuit surgery.' The approach was successfully demonstrated on IonQ's quantum processor using a healthcare model, showing agreement with classical baselines.
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠BinaryShield is the first privacy-preserving threat intelligence system that enables secure sharing of attack fingerprints across compliance boundaries for LLM services. The system addresses the critical security gap where organizations cannot share prompt injection attack intelligence between services due to privacy regulations, achieving an F1-score of 0.94 while providing similarity search 38x faster than dense-embedding baselines.
AI · Bearish · arXiv – CS AI · Mar 37/103
🧠Researchers developed ERIS, a new framework that uses genetic algorithms to exploit Audio Large Models (ALMs) by disguising malicious instructions as natural speech with background noise. The system can bypass safety filters by embedding harmful content in real-world audio interference that appears harmless to humans and security systems.
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers have developed TrajTrack, a new AI framework for 3D object tracking in LiDAR systems that achieves state-of-the-art performance while running at 55 FPS. The system improves tracking precision by 3.02% over existing methods by using historical trajectory data rather than computationally expensive multi-frame point cloud processing.
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers introduced GEM (General Experience Maker), an open-source environment simulator designed for training large language models through experience-based learning rather than static datasets. The framework provides a standardized interface similar to OpenAI-Gym but specifically optimized for LLMs, featuring diverse environments, integrated tools, and compatibility with popular RL training frameworks.
$MKR
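The "standardized interface similar to OpenAI-Gym" suggests a reset/step loop over text. The sketch below is an assumed minimal interface in that style, not GEM's actual API:

```python
class ArithmeticEnv:
    """Toy Gym-style text environment: the observation is a question string,
    the action is the model's answer, the reward is answer correctness."""
    def __init__(self, a=2, b=3):
        self.a, self.b = a, b

    def reset(self):
        return f"What is {self.a} + {self.b}?"

    def step(self, action):
        reward = 1.0 if action.strip() == str(self.a + self.b) else 0.0
        return None, reward, True, {}  # next_obs, reward, done, info

env = ArithmeticEnv()
question = env.reset()
_, reward, done, _ = env.step("5")
```

A standardized reset/step contract is what lets the same RL training framework plug into many environments without per-task glue code, which is the point of a Gym-like layer for LLMs.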
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers developed a new robotic policy framework using dense-jump flow matching with non-uniform time scheduling to address performance degradation in multi-step inference. The approach achieves up to 23.7% performance gains over existing baselines by optimizing integration scheduling during training and inference phases.
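A "non-uniform time schedule" just means integration steps that are not evenly spaced in [0, 1]; a simple power-law warp (an illustration, not the paper's schedule) looks like:

```python
def power_schedule(n_steps, gamma=2.0):
    """Integration times in [0, 1]: gamma > 1 clusters steps near t = 0,
    gamma < 1 clusters them near t = 1, gamma = 1 is uniform."""
    return [(i / (n_steps - 1)) ** gamma for i in range(n_steps)]

uniform = power_schedule(5, gamma=1.0)  # approx [0.0, 0.25, 0.5, 0.75, 1.0]
warped = power_schedule(5, gamma=2.0)   # approx [0.0, 0.0625, 0.25, 0.5625, 1.0]
```

Spending more integration steps where the learned velocity field changes fastest is the usual motivation for warping the schedule.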
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers have developed BWCache, a training-free method that accelerates Diffusion Transformer (DiT) video generation by up to 6× through block-wise feature caching and reuse. The technique exploits computational redundancy in DiT blocks across timesteps while maintaining visual quality, addressing a key bottleneck in real-world AI video generation applications.
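The caching idea can be sketched as: if a block's input barely changed since the previous timestep, reuse its previous output instead of recomputing (toy scalar change criterion; BWCache's block-wise policy is more involved):

```python
class CachedBlock:
    """Wrap a block; skip recomputation when the input moved less than
    `threshold` since the last call (scalar stand-in for feature maps)."""
    def __init__(self, fn, threshold=1e-3):
        self.fn, self.threshold = fn, threshold
        self.last_in, self.last_out, self.computes = None, None, 0

    def __call__(self, x):
        if self.last_in is not None and abs(x - self.last_in) < self.threshold:
            return self.last_out                  # cache hit: reuse stale output
        self.computes += 1                        # cache miss: recompute
        self.last_in, self.last_out = x, self.fn(x)
        return self.last_out

block = CachedBlock(lambda x: x * x)
outputs = [block(t) for t in [1.0, 1.0005, 1.5]]  # second call hits the cache
```

The speedup comes from how similar adjacent diffusion timesteps are; the threshold trades compute against the drift introduced by reusing slightly stale features.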
AI · Neutral · arXiv – CS AI · Mar 37/104
🧠Researchers have identified the mathematical mechanisms behind 'loss of plasticity' (LoP), explaining why deep learning models struggle to continue learning in changing environments. The study reveals that properties promoting generalization in static settings actually hinder continual learning by creating parameter space traps.
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers introduce SVDecode, a new method for adapting large language models to specific tasks without extensive fine-tuning. The technique uses steering vectors during decoding to align output distributions with task requirements, improving accuracy by up to 5 percentage points while adding minimal computational overhead.
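In spirit, decoding-time steering adds a fixed direction to the hidden state before the unembedding projection; a minimal linear-algebra sketch (assumed setup, not SVDecode's exact procedure):

```python
def steered_logits(hidden, steering, unembed, alpha=1.0):
    """Shift the hidden state by alpha * steering, then project to logits.
    hidden, steering: vectors; unembed: one row per vocabulary token."""
    shifted = [h + alpha * s for h, s in zip(hidden, steering)]
    return [sum(h * w for h, w in zip(shifted, row)) for row in unembed]

unembed = [[1.0, 0.0], [0.0, 1.0]]       # identity unembedding, 2-token vocab
base = steered_logits([1.0, 0.0], [0.0, 1.0], unembed, alpha=0.0)
steered = steered_logits([1.0, 0.0], [0.0, 1.0], unembed, alpha=1.0)
```

Because the steering vector is added during the forward pass rather than baked into the weights, no gradient updates are needed, which is why the overhead stays minimal.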
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers have developed Curvature-Aware Policy Optimization (CAPO), a new algorithm that improves training stability and sample efficiency for large language models by up to 30x. The method uses curvature information from the optimization landscape to identify and filter problematic training samples, requiring intervention on fewer than 8% of tokens.