Models, papers, tools. 17,641 articles with AI-powered sentiment analysis and key takeaways.
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers have published a comprehensive survey exploring the integration of Large Language Models (LLMs) with Uncrewed Aerial Vehicles (UAVs), proposing a unified framework for intelligent drone operations. The study examines how LLMs can enhance UAV capabilities including swarm coordination, navigation, mission planning, and human-drone interaction through advanced reasoning and multimodal processing.
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers introduce HalluGuard, a new framework that identifies and addresses both data-driven and reasoning-driven hallucinations in Large Language Models. The system achieved state-of-the-art performance across 10 benchmarks and 9 LLM backbones, offering a unified approach to improve AI reliability in critical domains like healthcare and law.
AI · Bullish · arXiv – CS AI · Mar 37/102
🧠ButterflyMoE introduces a breakthrough approach that reduces memory requirements for Mixture-of-Experts (MoE) models by 150× through geometric parameterization instead of storing independent weight matrices per expert. The method uses shared ternary prototypes with learned rotations to achieve sub-linear memory scaling, enabling deployment of multiple experts on edge devices.
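A minimal NumPy sketch of the idea as described, one shared ternary prototype reused by every expert through a per-expert learned transform; the rotation construction, names, and sizes below are illustrative assumptions, not the paper's actual parameterization:

```python
import numpy as np

d, n_experts = 256, 8
rng = np.random.default_rng(0)

# One shared ternary prototype (entries in {-1, 0, +1}) reused by all experts.
prototype = rng.integers(-1, 2, size=(d, d)).astype(np.float32)

def random_rotation(dim, rng):
    # Dense orthogonal matrix via QR; a stand-in for structured rotations.
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q.astype(np.float32)

# Per-expert parameters: a rotation and a scalar scale, no independent weights.
rotations = [random_rotation(d, rng) for _ in range(n_experts)]
scales = rng.uniform(0.5, 1.5, size=n_experts).astype(np.float32)

def expert_weight(e):
    # Materialize expert e's weight matrix on the fly instead of storing it.
    return scales[e] * (rotations[e] @ prototype)

x = rng.standard_normal(d).astype(np.float32)
y = expert_weight(3) @ x            # forward pass through expert 3
print(y.shape)                      # (256,)
```

In this toy version the rotations are dense orthogonal matrices, so they would not actually save memory; the reported 150× saving depends on the rotations being structured (butterfly-style factors with far fewer parameters) and the prototype being stored in ternary form.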
AI · Neutral · arXiv – CS AI · Mar 37/103
🧠A comprehensive study of 10 leading reward models reveals they inherit significant value biases from their base language models, with Llama-based models preferring 'agency' values while Gemma-based models favor 'communion' values. This bias persists even when using identical preference data and training processes, suggesting that the choice of base model fundamentally shapes AI alignment outcomes.
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers introduce AgentOCR, a framework that converts AI agent interaction histories from text to compressed visual format, reducing token usage by over 50% while maintaining 95% performance. The system uses visual caching and adaptive compression to address memory bottlenecks in large language model deployments.
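A rough Pillow-based sketch of the core move, rendering the textual interaction history into an image that a vision-language model can read in place of raw tokens; AgentOCR's visual caching and adaptive compression are not shown, and `render_history` is a hypothetical helper:

```python
from PIL import Image, ImageDraw

def render_history(turns, width=640, line_height=14):
    # Render the agent's textual history into one image a VLM could consume
    # instead of the raw token sequence.
    lines = [f"{role}: {text}" for role, text in turns]
    img = Image.new("RGB", (width, line_height * (len(lines) + 1)), "white")
    draw = ImageDraw.Draw(img)
    for i, line in enumerate(lines):
        draw.text((4, 4 + i * line_height), line, fill="black")
    return img

history = [("user", "Book a flight to Tokyo"),
           ("agent", "Searching flights..."),
           ("tool", "3 results found")]
render_history(history).save("history.png", optimize=True)
```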
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers developed WaveLSFormer, a wavelet-based Transformer model that directly generates market-neutral long/short trading portfolios from financial time series data. The AI system achieved a 60.7% cumulative return and 2.16 Sharpe ratio across six industry groups, significantly outperforming traditional ML models like LSTM and standard Transformers.
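The model itself cannot be reconstructed from the summary, but the "market-neutral long/short" output can be illustrated with a generic weight construction (an assumption here, not taken from the paper): demean the per-asset scores so longs and shorts cancel, then scale to unit gross exposure.

```python
import numpy as np

def market_neutral_weights(scores, eps=1e-12):
    # Demean so longs and shorts cancel (net exposure ~0),
    # then scale to unit gross exposure.
    w = scores - scores.mean()
    return w / (np.abs(w).sum() + eps)

scores = np.array([0.8, -0.3, 0.1, -0.6, 0.0])   # per-asset model scores
w = market_neutral_weights(scores)
print(np.round(w, 3), round(w.sum(), 6), round(np.abs(w).sum(), 6))
```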
AI · Neutral · arXiv – CS AI · Mar 37/103
🧠Researchers have identified and studied the 'Mandela effect' in AI multi-agent systems, where groups of AI agents collectively develop false memories or misremember information. The study introduces MANBENCH, a benchmark to evaluate this phenomenon, and proposes mitigation strategies that achieved a 74.40% reduction in false collective memories.
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers have released WAXAL, a large-scale multilingual speech dataset covering 24 Sub-Saharan African languages representing over 100 million speakers. The dataset includes 1,250 hours of transcribed speech for ASR and 235 hours of high-quality recordings for TTS, released under CC-BY-4.0 license to advance inclusive AI technologies.
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers introduce Meta Engine, a unified semantic query system that integrates multiple specialized LLM-based query systems to handle multi-modal data analysis. The system addresses fragmentation in current semantic query tools by combining specialized systems through five key components, achieving 3-24x better performance than existing baselines.
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers propose that intrinsic task symmetries drive 'grokking' - the sudden transition from memorization to generalization in neural networks. The study identifies a three-stage training process and introduces diagnostic tools to predict and accelerate the onset of generalization in algorithmic reasoning tasks.
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers developed UrbanFM, a foundation model for urban spatio-temporal data that can analyze traffic patterns and city dynamics across over 100 global cities. The model demonstrates zero-shot generalization capabilities, meaning it can make predictions for unseen cities without additional training, potentially revolutionizing urban planning and smart city applications.
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers introduce Dream2Learn (D2L), a continual learning framework that enables AI models to generate synthetic training data from their own internal representations, mimicking human dreaming for knowledge consolidation. The system creates novel 'dreamed classes' using diffusion models to improve forward knowledge transfer and prevent catastrophic forgetting in neural networks.
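A toy sketch of the generative-replay pattern the summary describes, mixing "dreamed" samples for past classes into each new-task batch; the diffusion model and the construction of novel dreamed classes are replaced by a placeholder generator, so this is an assumed simplification rather than D2L itself:

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_generator(label, dim=16):
    # Placeholder for D2L's diffusion model: class-conditional noise.
    return rng.standard_normal(dim) + label

def dream_batch(generator, past_classes, n):
    # Synthesize 'dreamed' examples for previously learned classes.
    ys = rng.choice(past_classes, size=n)
    xs = np.stack([generator(y) for y in ys])
    return xs, ys

def continual_step(new_x, new_y, past_classes, replay_frac=0.5):
    # Mix dreamed samples into the new task's batch before the update.
    n_replay = max(1, int(len(new_y) * replay_frac))
    dx, dy = dream_batch(toy_generator, past_classes, n_replay)
    return np.concatenate([new_x, dx]), np.concatenate([new_y, dy])

x, y = continual_step(rng.standard_normal((8, 16)), np.full(8, 3), [0, 1, 2])
print(x.shape, y.shape)   # (12, 16) (12,)
```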
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Meta presents CharacterFlywheel, an iterative process for improving large language models in production social chat applications across Instagram, WhatsApp, and Messenger. Starting from LLaMA 3.1, the system achieved significant improvements through 15 generations of refinement, with the best models showing up to 8.8% improvement in engagement breadth and 19.4% in engagement depth while substantially improving instruction following capabilities.
AI · Neutral · arXiv – CS AI · Mar 37/103
🧠Researchers introduced MMR-Life, a comprehensive benchmark with 2,646 questions and 19,108 real-world images to evaluate multimodal reasoning capabilities of AI models. Even top models like GPT-5 achieved only 58% accuracy, highlighting significant challenges in real-world multimodal reasoning across seven different reasoning types.
AI · Neutral · arXiv – CS AI · Mar 37/104
🧠Researchers extend the "Selection as Power" framework to dynamic settings, introducing constrained reinforcement learning that maintains bounded decision authority in AI systems. The study demonstrates that governance constraints can prevent AI systems from collapsing into deterministic dominance while still allowing adaptive improvement through controlled parameter updates.
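One simple way to picture "bounded decision authority" is a hard entropy floor on the policy: updates may sharpen the action distribution, but never past the point where it becomes effectively deterministic. The temperature-bisection projection below is my own illustrative stand-in, not the paper's constrained-RL formulation:

```python
import numpy as np

def entropy_and_probs(logits, temperature):
    z = logits / temperature
    p = np.exp(z - z.max())
    p /= p.sum()
    return -(p * np.log(p + 1e-12)).sum(), p

def project_to_entropy_floor(logits, h_min, iters=40):
    # Find the smallest temperature >= 1 whose softmax keeps at least
    # h_min nats of entropy, so the policy never collapses to one action.
    h, p = entropy_and_probs(logits, 1.0)
    if h >= h_min:
        return p
    t_lo, t_hi = 1.0, 2.0
    while entropy_and_probs(logits, t_hi)[0] < h_min:
        t_hi *= 2.0
    for _ in range(iters):           # bisect on temperature
        mid = 0.5 * (t_lo + t_hi)
        if entropy_and_probs(logits, mid)[0] < h_min:
            t_lo = mid
        else:
            t_hi = mid
    return entropy_and_probs(logits, t_hi)[1]

p = project_to_entropy_floor(np.array([8.0, 0.5, 0.2]), h_min=0.7)
print(np.round(p, 3))
```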
AI · Neutral · arXiv – CS AI · Mar 37/104
🧠Researchers developed a new graph concept bottleneck layer (GCBM) that can be integrated into Graph Neural Networks to make their decision-making process more interpretable. The method treats graph concepts as 'words' and uses language models to improve understanding of how GNNs make predictions, achieving state-of-the-art performance in both classification accuracy and interpretability.
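A minimal PyTorch sketch of a concept-bottleneck head for a GNN: the pooled graph embedding is mapped to interpretable concept activations, and the class prediction is made from those concepts alone. The "concepts as words" and language-model components of GCBM are omitted; layer sizes and names are assumptions:

```python
import torch
import torch.nn as nn

class ConceptBottleneckHead(nn.Module):
    # Graph embedding -> interpretable concept scores -> class prediction.
    def __init__(self, emb_dim, n_concepts, n_classes):
        super().__init__()
        self.to_concepts = nn.Linear(emb_dim, n_concepts)
        self.to_classes = nn.Linear(n_concepts, n_classes)

    def forward(self, graph_emb):
        concepts = torch.sigmoid(self.to_concepts(graph_emb))  # per-concept activations
        logits = self.to_classes(concepts)                     # prediction uses only concepts
        return logits, concepts

head = ConceptBottleneckHead(emb_dim=64, n_concepts=10, n_classes=3)
emb = torch.randn(4, 64)          # stand-in for pooled GNN output
logits, concepts = head(emb)
print(logits.shape, concepts.shape)
```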
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers introduce HEAPr, a novel pruning algorithm for Mixture-of-Experts (MoE) language models that decomposes experts into atomic components for more precise pruning. The method achieves nearly lossless compression at 20-25% pruning ratios while reducing computational costs by approximately 20%.
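A sketch of pruning at sub-expert granularity, assuming for illustration that an expert's "atomic components" are rows of its weight matrix and that importance is plain L2 norm; HEAPr's actual decomposition and scoring are more involved:

```python
import numpy as np

def prune_expert_components(weight, ratio=0.25):
    # Zero out the lowest-norm fraction of an expert's components (rows).
    norms = np.linalg.norm(weight, axis=1)
    k = int(len(norms) * ratio)
    drop = np.argsort(norms)[:k]          # weakest components
    pruned = weight.copy()
    pruned[drop] = 0.0
    return pruned, drop

rng = np.random.default_rng(0)
w = rng.standard_normal((128, 512))
pruned, dropped = prune_expert_components(w, ratio=0.25)
print(len(dropped), "of", w.shape[0], "components removed")
```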
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers demonstrated that large language models can improve multi-hop reasoning performance by training on rule-generated synthetic data instead of expensive human annotations or frontier LLM outputs. The study found that LLMs trained on synthetic fictional data performed better on real-world question-answering benchmarks by learning fundamental knowledge composition skills.
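The data-generation side is easy to picture: a few relation templates over fictional entities can be composed into two-hop questions whose answers are known by construction. The templates and names below are invented for illustration, not taken from the paper:

```python
import random

random.seed(0)
people = ["Zorin", "Velk", "Marnis", "Quell"]
places = ["Drelmar", "Osket", "Fenwick"]

def make_two_hop_example():
    # Compose two rule-generated fictional facts into one 2-hop question.
    a, b = random.sample(people, 2)
    c = random.choice(places)
    facts = f"{a} is the mentor of {b}. {b} lives in {c}."
    question = f"Where does the person mentored by {a} live?"
    return {"context": facts, "question": question, "answer": c}

for _ in range(2):
    print(make_two_hop_example())
```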
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers introduced AgentMath, a new AI framework that combines language models with code interpreters to solve complex mathematical problems more efficiently than current Large Reasoning Models. The system achieves state-of-the-art performance on mathematical competition benchmarks, with AgentMath-30B-A3B reaching 90.6% accuracy on AIME24 while remaining competitive with much larger models like OpenAI-o3.
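A bare-bones sketch of the LLM-plus-interpreter loop such a system relies on: the model proposes code, an interpreter runs it, and the output is appended to the context for the next turn. `call_model` is a stub, and nothing here reflects AgentMath's actual prompts or stopping logic:

```python
import io
import contextlib

def run_python(code):
    # Minimal 'code interpreter': execute the snippet and capture stdout.
    # A real deployment would sandbox this instead of calling exec directly.
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue().strip()

def call_model(transcript):
    # Stub for the language model: always 'proposes' the same script here.
    return "print(sum(i * i for i in range(1, 11)))"

def solve(problem, max_turns=3):
    transcript = problem
    for _ in range(max_turns):
        code = call_model(transcript)
        result = run_python(code)
        transcript += f"\n[code]\n{code}\n[output]\n{result}"
        if result:                     # a real agent decides when it is done
            return result, transcript
    return None, transcript

answer, _ = solve("Compute the sum of the squares of 1..10.")
print(answer)   # 385
```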
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers propose GenDB, a revolutionary database system that uses Large Language Models to synthesize query execution code instead of relying on traditional engineered query processors. Early prototype testing shows GenDB outperforms established systems like DuckDB, Umbra, and PostgreSQL on OLAP workloads.
AI × Crypto · Bullish · arXiv – CS AI · Mar 37/103
🤖Researchers have developed SymGPT, a new tool that combines large language models with symbolic execution to automatically audit smart contracts for ERC rule violations. The tool identified 5,783 violations in 4,000 real-world contracts, including 1,375 with clear attack paths for financial theft, outperforming existing automated analysis methods.
$ETH
AI × Crypto · Bullish · arXiv – CS AI · Mar 37/104
🤖TAO is a new verification protocol that enables users to verify neural network outputs from untrusted cloud services without requiring exact computation matches. The system uses tolerance-aware verification with IEEE-754 bounds and empirical profiles, implementing a dispute resolution mechanism deployed on Ethereum testnet.
$ETH $TAO
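A toy version of tolerance-aware verification: recompute the network locally and accept the provider's claimed output if it lies within a floating-point error budget, rather than requiring bit-exact equality. The error bound below is a crude stand-in for TAO's IEEE-754 analysis and empirical profiles, and the on-chain dispute mechanism is not modeled:

```python
import numpy as np

def forward(x, weights):
    y = x
    for w in weights:
        y = np.maximum(w @ y, 0.0)     # small ReLU MLP
    return y

def verify_claimed_output(x, weights, claimed):
    # Accept the claimed output if it is within a floating-point error budget
    # of a local recomputation, instead of demanding bit-exact agreement.
    y = forward(x, weights)
    n_ops = sum(w.shape[1] for w in weights)      # rough op count per output
    tol = n_ops * np.finfo(np.float32).eps * max(1.0, float(np.abs(y).max()))
    return bool(np.all(np.abs(y - claimed) <= tol)), tol

rng = np.random.default_rng(0)
x = rng.standard_normal(64).astype(np.float32)
weights = [(0.1 * rng.standard_normal((64, 64))).astype(np.float32) for _ in range(3)]

claimed = forward(x, weights)          # an honest provider's answer
ok, tol = verify_claimed_output(x, weights, claimed)
print(ok, tol)
```

A provider using a different kernel or reduced precision would produce slightly different bits but could still pass, which is the point of verifying within a tolerance.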
AI · Neutral · arXiv – CS AI · Mar 37/103
🧠Researchers prove that gradient descent in neural networks converges to optimal robustness margins at an extremely slow rate of Θ(1/ln(t)), even in simplified two-neuron settings. This establishes the first explicit lower bound on convergence rates for robustness margins in non-linear models, revealing fundamental limitations in neural network training efficiency.
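To see how slow Θ(1/ln(t)) is in practice: because ln(t²) = 2·ln(t), halving the remaining margin gap requires squaring the number of training steps, as the toy calculation below shows (constants ignored):

```python
import math

# gap(t) ∝ 1 / ln(t): halving the gap means squaring the step count t.
for t in (10**3, 10**6, 10**12):
    print(f"t = {t:>16,d}   1/ln(t) = {1 / math.log(t):.4f}")
```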
AI · Neutral · arXiv – CS AI · Mar 37/104
🧠Researchers identify a 'safety mirage' problem in vision language models where supervised fine-tuning creates spurious correlations that make models vulnerable to simple attacks and overly cautious with benign queries. They propose machine unlearning as an alternative that reduces attack success rates by up to 60.27% and unnecessary rejections by over 84.20%.
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers introduce Robometer, a new framework for training robot reward models that combines progress tracking with trajectory comparisons to better learn from failed attempts. The system is trained on RBM-1M, a dataset of over one million robot trajectories including failures, and shows improved performance across diverse robotics applications.
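A compact sketch of how progress tracking and trajectory comparison can be combined in one reward-model objective: per-step rewards regress onto progress labels, while a Bradley-Terry ranking term pushes successful trajectories above failed ones. The loss weighting and architecture here are assumptions, not Robometer's actual training recipe:

```python
import torch
import torch.nn.functional as F

def reward_model_loss(r_step, progress_target, r_success, r_failure):
    # (a) Progress tracking: per-step rewards regress onto progress labels.
    progress_loss = F.mse_loss(r_step, progress_target)
    # (b) Trajectory comparison: successful trajectory should outscore a failed one.
    ranking_loss = -F.logsigmoid(r_success.sum() - r_failure.sum())
    return progress_loss + ranking_loss

r_step = torch.rand(20, requires_grad=True)           # stand-in model outputs
progress = torch.linspace(0, 1, 20)                   # progress labels 0 -> 1
r_succ, r_fail = torch.rand(20) + 0.5, torch.rand(20)
loss = reward_model_loss(r_step, progress, r_succ, r_fail)
loss.backward()
print(float(loss))
```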