Models, papers, tools. 17,641 articles with AI-powered sentiment analysis and key takeaways.
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers have published a comprehensive survey exploring the integration of Large Language Models (LLMs) with Uncrewed Aerial Vehicles (UAVs), proposing a unified framework for intelligent drone operations. The study examines how LLMs can enhance UAV capabilities including swarm coordination, navigation, mission planning, and human-drone interaction through advanced reasoning and multimodal processing.
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers introduce HalluGuard, a new framework that identifies and addresses both data-driven and reasoning-driven hallucinations in Large Language Models. The system achieved state-of-the-art performance across 10 benchmarks and 9 LLM backbones, offering a unified approach to improve AI reliability in critical domains like healthcare and law.
AI · Bullish · arXiv – CS AI · Mar 37/102
🧠ButterflyMoE introduces a breakthrough approach that reduces memory requirements for Mixture-of-Experts (MoE) models by 150× through geometric parameterization instead of storing independent weight matrices per expert. The method uses shared ternary prototypes with learned rotations to achieve sub-linear memory scaling, enabling deployment of multiple experts on edge devices.
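A minimal NumPy sketch of the idea as described, one shared ternary prototype reused by every expert through a per-expert learned transform; the rotation construction, names, and sizes below are illustrative assumptions, not the paper's actual parameterization:

```python
import numpy as np

d, n_experts = 256, 8
rng = np.random.default_rng(0)

# One shared ternary prototype (entries in {-1, 0, +1}) reused by all experts.
prototype = rng.integers(-1, 2, size=(d, d)).astype(np.float32)

def random_rotation(dim, rng):
    # Dense orthogonal matrix via QR; a stand-in for structured rotations.
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q.astype(np.float32)

# Per-expert parameters: a rotation and a scalar scale, no independent weights.
rotations = [random_rotation(d, rng) for _ in range(n_experts)]
scales = rng.uniform(0.5, 1.5, size=n_experts).astype(np.float32)

def expert_weight(e):
    # Materialize expert e's weight matrix on the fly instead of storing it.
    return scales[e] * (rotations[e] @ prototype)

x = rng.standard_normal(d).astype(np.float32)
y = expert_weight(3) @ x            # forward pass through expert 3
print(y.shape)                      # (256,)
```

In this toy version the rotations are dense orthogonal matrices, so they would not actually save memory; the reported 150× saving depends on the rotations being structured (butterfly-style factors with far fewer parameters) and the prototype being stored in ternary form.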
AI · Neutral · arXiv – CS AI · Mar 37/103
🧠A comprehensive study of 10 leading reward models reveals they inherit significant value biases from their base language models, with Llama-based models preferring 'agency' values while Gemma-based models favor 'communion' values. This bias persists even when using identical preference data and training processes, suggesting that the choice of base model fundamentally shapes AI alignment outcomes.
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers introduce AgentOCR, a framework that converts AI agent interaction histories from text to compressed visual format, reducing token usage by over 50% while maintaining 95% performance. The system uses visual caching and adaptive compression to address memory bottlenecks in large language model deployments.
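A rough Pillow-based sketch of the core move, rendering the textual interaction history into an image that a vision-language model can read in place of raw tokens; AgentOCR's visual caching and adaptive compression are not shown, and `render_history` is a hypothetical helper:

```python
from PIL import Image, ImageDraw

def render_history(turns, width=640, line_height=14):
    # Render the agent's textual history into one image a VLM could consume
    # instead of the raw token sequence.
    lines = [f"{role}: {text}" for role, text in turns]
    img = Image.new("RGB", (width, line_height * (len(lines) + 1)), "white")
    draw = ImageDraw.Draw(img)
    for i, line in enumerate(lines):
        draw.text((4, 4 + i * line_height), line, fill="black")
    return img

history = [("user", "Book a flight to Tokyo"),
           ("agent", "Searching flights..."),
           ("tool", "3 results found")]
render_history(history).save("history.png", optimize=True)
```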
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers developed WaveLSFormer, a wavelet-based Transformer model that directly generates market-neutral long/short trading portfolios from financial time series data. The AI system achieved a 60.7% cumulative return and 2.16 Sharpe ratio across six industry groups, significantly outperforming traditional ML models like LSTM and standard Transformers.
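The model itself cannot be reconstructed from the summary, but the "market-neutral long/short" output can be illustrated with a generic weight construction (an assumption here, not taken from the paper): demean the per-asset scores so longs and shorts cancel, then scale to unit gross exposure.

```python
import numpy as np

def market_neutral_weights(scores, eps=1e-12):
    # Demean so longs and shorts cancel (net exposure ~0),
    # then scale to unit gross exposure.
    w = scores - scores.mean()
    return w / (np.abs(w).sum() + eps)

scores = np.array([0.8, -0.3, 0.1, -0.6, 0.0])   # per-asset model scores
w = market_neutral_weights(scores)
print(np.round(w, 3), round(w.sum(), 6), round(np.abs(w).sum(), 6))
```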
AI · Neutral · arXiv – CS AI · Mar 37/103
🧠Researchers have identified and studied the 'Mandela effect' in AI multi-agent systems, where groups of AI agents collectively develop false memories or misremember information. The study introduces MANBENCH, a benchmark to evaluate this phenomenon, and proposes mitigation strategies that achieved a 74.40% reduction in false collective memories.
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers have released WAXAL, a large-scale multilingual speech dataset covering 24 Sub-Saharan African languages representing over 100 million speakers. The dataset includes 1,250 hours of transcribed speech for ASR and 235 hours of high-quality recordings for TTS, released under CC-BY-4.0 license to advance inclusive AI technologies.
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers introduce Meta Engine, a unified semantic query system that integrates multiple specialized LLM-based query systems to handle multi-modal data analysis. The system addresses fragmentation in current semantic query tools by combining specialized systems through five key components, achieving 3-24x better performance than existing baselines.
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers propose that intrinsic task symmetries drive 'grokking' - the sudden transition from memorization to generalization in neural networks. The study identifies a three-stage training process and introduces diagnostic tools to predict and accelerate the onset of generalization in algorithmic reasoning tasks.
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers developed UrbanFM, a foundation model for urban spatio-temporal data that can analyze traffic patterns and city dynamics across over 100 global cities. The model demonstrates zero-shot generalization capabilities, meaning it can make predictions for unseen cities without additional training, potentially revolutionizing urban planning and smart city applications.
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers introduce Dream2Learn (D2L), a continual learning framework that enables AI models to generate synthetic training data from their own internal representations, mimicking human dreaming for knowledge consolidation. The system creates novel 'dreamed classes' using diffusion models to improve forward knowledge transfer and prevent catastrophic forgetting in neural networks.
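A toy sketch of the generative-replay pattern the summary describes, mixing "dreamed" samples for past classes into each new-task batch; the diffusion model and the construction of novel dreamed classes are replaced by a placeholder generator, so this is an assumed simplification rather than D2L itself:

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_generator(label, dim=16):
    # Placeholder for D2L's diffusion model: class-conditional noise.
    return rng.standard_normal(dim) + label

def dream_batch(generator, past_classes, n):
    # Synthesize 'dreamed' examples for previously learned classes.
    ys = rng.choice(past_classes, size=n)
    xs = np.stack([generator(y) for y in ys])
    return xs, ys

def continual_step(new_x, new_y, past_classes, replay_frac=0.5):
    # Mix dreamed samples into the new task's batch before the update.
    n_replay = max(1, int(len(new_y) * replay_frac))
    dx, dy = dream_batch(toy_generator, past_classes, n_replay)
    return np.concatenate([new_x, dx]), np.concatenate([new_y, dy])

x, y = continual_step(rng.standard_normal((8, 16)), np.full(8, 3), [0, 1, 2])
print(x.shape, y.shape)   # (12, 16) (12,)
```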
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Meta presents CharacterFlywheel, an iterative process for improving large language models in production social chat applications across Instagram, WhatsApp, and Messenger. Starting from LLaMA 3.1, the system achieved significant improvements through 15 generations of refinement, with the best models showing up to 8.8% improvement in engagement breadth and 19.4% in engagement depth while substantially improving instruction following capabilities.
AI · Neutral · arXiv – CS AI · Mar 37/103
🧠Researchers introduced MMR-Life, a comprehensive benchmark with 2,646 questions and 19,108 real-world images to evaluate multimodal reasoning capabilities of AI models. Even top models like GPT-5 achieved only 58% accuracy, highlighting significant challenges in real-world multimodal reasoning across seven different reasoning types.
AI · Neutral · arXiv – CS AI · Mar 37/104
🧠Researchers extend the "Selection as Power" framework to dynamic settings, introducing constrained reinforcement learning that maintains bounded decision authority in AI systems. The study demonstrates that governance constraints can prevent AI systems from collapsing into deterministic dominance while still allowing adaptive improvement through controlled parameter updates.
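One simple way to picture "bounded decision authority" is a hard entropy floor on the policy: updates may sharpen the action distribution, but never past the point where it becomes effectively deterministic. The temperature-bisection projection below is my own illustrative stand-in, not the paper's constrained-RL formulation:

```python
import numpy as np

def entropy_and_probs(logits, temperature):
    z = logits / temperature
    p = np.exp(z - z.max())
    p /= p.sum()
    return -(p * np.log(p + 1e-12)).sum(), p

def project_to_entropy_floor(logits, h_min, iters=40):
    # Find the smallest temperature >= 1 whose softmax keeps at least
    # h_min nats of entropy, so the policy never collapses to one action.
    h, p = entropy_and_probs(logits, 1.0)
    if h >= h_min:
        return p
    t_lo, t_hi = 1.0, 2.0
    while entropy_and_probs(logits, t_hi)[0] < h_min:
        t_hi *= 2.0
    for _ in range(iters):           # bisect on temperature
        mid = 0.5 * (t_lo + t_hi)
        if entropy_and_probs(logits, mid)[0] < h_min:
            t_lo = mid
        else:
            t_hi = mid
    return entropy_and_probs(logits, t_hi)[1]

p = project_to_entropy_floor(np.array([8.0, 0.5, 0.2]), h_min=0.7)
print(np.round(p, 3))
```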
AI · Neutral · arXiv – CS AI · Mar 37/104
🧠Researchers developed a new graph concept bottleneck layer (GCBM) that can be integrated into Graph Neural Networks to make their decision-making process more interpretable. The method treats graph concepts as 'words' and uses language models to improve understanding of how GNNs make predictions, achieving state-of-the-art performance in both classification accuracy and interpretability.
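A minimal PyTorch sketch of a concept-bottleneck head for a GNN: the pooled graph embedding is mapped to interpretable concept activations, and the class prediction is made from those concepts alone. The "concepts as words" and language-model components of GCBM are omitted; layer sizes and names are assumptions:

```python
import torch
import torch.nn as nn

class ConceptBottleneckHead(nn.Module):
    # Graph embedding -> interpretable concept scores -> class prediction.
    def __init__(self, emb_dim, n_concepts, n_classes):
        super().__init__()
        self.to_concepts = nn.Linear(emb_dim, n_concepts)
        self.to_classes = nn.Linear(n_concepts, n_classes)

    def forward(self, graph_emb):
        concepts = torch.sigmoid(self.to_concepts(graph_emb))  # per-concept activations
        logits = self.to_classes(concepts)                     # prediction uses only concepts
        return logits, concepts

head = ConceptBottleneckHead(emb_dim=64, n_concepts=10, n_classes=3)
emb = torch.randn(4, 64)          # stand-in for pooled GNN output
logits, concepts = head(emb)
print(logits.shape, concepts.shape)
```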
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers introduce HEAPr, a novel pruning algorithm for Mixture-of-Experts (MoE) language models that decomposes experts into atomic components for more precise pruning. The method achieves nearly lossless compression at 20-25% pruning ratios while reducing computational costs by approximately 20%.
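A sketch of pruning at sub-expert granularity, assuming for illustration that an expert's "atomic components" are rows of its weight matrix and that importance is plain L2 norm; HEAPr's actual decomposition and scoring are more involved:

```python
import numpy as np

def prune_expert_components(weight, ratio=0.25):
    # Zero out the lowest-norm fraction of an expert's components (rows).
    norms = np.linalg.norm(weight, axis=1)
    k = int(len(norms) * ratio)
    drop = np.argsort(norms)[:k]          # weakest components
    pruned = weight.copy()
    pruned[drop] = 0.0
    return pruned, drop

rng = np.random.default_rng(0)
w = rng.standard_normal((128, 512))
pruned, dropped = prune_expert_components(w, ratio=0.25)
print(len(dropped), "of", w.shape[0], "components removed")
```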
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers demonstrated that large language models can improve multi-hop reasoning performance by training on rule-generated synthetic data instead of expensive human annotations or frontier LLM outputs. The study found that LLMs trained on synthetic fictional data performed better on real-world question-answering benchmarks by learning fundamental knowledge composition skills.
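The data-generation side is easy to picture: a few relation templates over fictional entities can be composed into two-hop questions whose answers are known by construction. The templates and names below are invented for illustration, not taken from the paper:

```python
import random

random.seed(0)
people = ["Zorin", "Velk", "Marnis", "Quell"]
places = ["Drelmar", "Osket", "Fenwick"]

def make_two_hop_example():
    # Compose two rule-generated fictional facts into one 2-hop question.
    a, b = random.sample(people, 2)
    c = random.choice(places)
    facts = f"{a} is the mentor of {b}. {b} lives in {c}."
    question = f"Where does the person mentored by {a} live?"
    return {"context": facts, "question": question, "answer": c}

for _ in range(2):
    print(make_two_hop_example())
```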
AI · Bullish · arXiv – CS AI · Mar 37/104
🧠Researchers introduced AgentMath, a new AI framework that combines language models with code interpreters to solve complex mathematical problems more efficiently than current Large Reasoning Models. The system achieves state-of-the-art performance on mathematical competition benchmarks, with AgentMath-30B-A3B reaching 90.6% accuracy on AIME24 while remaining competitive with much larger models like OpenAI-o3.
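A bare-bones sketch of the LLM-plus-interpreter loop such a system relies on: the model proposes code, an interpreter runs it, and the output is appended to the context for the next turn. `call_model` is a stub, and nothing here reflects AgentMath's actual prompts or stopping logic:

```python
import io
import contextlib

def run_python(code):
    # Minimal 'code interpreter': execute the snippet and capture stdout.
    # A real deployment would sandbox this instead of calling exec directly.
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue().strip()

def call_model(transcript):
    # Stub for the language model: always 'proposes' the same script here.
    return "print(sum(i * i for i in range(1, 11)))"

def solve(problem, max_turns=3):
    transcript = problem
    for _ in range(max_turns):
        code = call_model(transcript)
        result = run_python(code)
        transcript += f"\n[code]\n{code}\n[output]\n{result}"
        if result:                     # a real agent decides when it is done
            return result, transcript
    return None, transcript

answer, _ = solve("Compute the sum of the squares of 1..10.")
print(answer)   # 385
```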
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers propose GenDB, a revolutionary database system that uses Large Language Models to synthesize query execution code instead of relying on traditional engineered query processors. Early prototype testing shows GenDB outperforms established systems like DuckDB, Umbra, and PostgreSQL on OLAP workloads.
AI × Crypto · Bullish · arXiv – CS AI · Mar 37/103
🤖Researchers have developed SymGPT, a new tool that combines large language models with symbolic execution to automatically audit smart contracts for ERC rule violations. The tool identified 5,783 violations in 4,000 real-world contracts, including 1,375 with clear attack paths for financial theft, outperforming existing automated analysis methods.
$ETH
AI × Crypto · Bullish · arXiv – CS AI · Mar 37/104
🤖TAO is a new verification protocol that enables users to verify neural network outputs from untrusted cloud services without requiring exact computation matches. The system uses tolerance-aware verification with IEEE-754 bounds and empirical profiles, implementing a dispute resolution mechanism deployed on Ethereum testnet.
$ETH $TAO
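A toy version of tolerance-aware verification: recompute the network locally and accept the provider's claimed output if it lies within a floating-point error budget, rather than requiring bit-exact equality. The error bound below is a crude stand-in for TAO's IEEE-754 analysis and empirical profiles, and the on-chain dispute mechanism is not modeled:

```python
import numpy as np

def forward(x, weights):
    y = x
    for w in weights:
        y = np.maximum(w @ y, 0.0)     # small ReLU MLP
    return y

def verify_claimed_output(x, weights, claimed):
    # Accept the claimed output if it is within a floating-point error budget
    # of a local recomputation, instead of demanding bit-exact agreement.
    y = forward(x, weights)
    n_ops = sum(w.shape[1] for w in weights)      # rough op count per output
    tol = n_ops * np.finfo(np.float32).eps * max(1.0, float(np.abs(y).max()))
    return bool(np.all(np.abs(y - claimed) <= tol)), tol

rng = np.random.default_rng(0)
x = rng.standard_normal(64).astype(np.float32)
weights = [(0.1 * rng.standard_normal((64, 64))).astype(np.float32) for _ in range(3)]

claimed = forward(x, weights)          # an honest provider's answer
ok, tol = verify_claimed_output(x, weights, claimed)
print(ok, tol)
```

A provider using a different kernel or reduced precision would produce slightly different bits but could still pass, which is the point of verifying within a tolerance.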
AI · Neutral · arXiv – CS AI · Mar 37/103
🧠Researchers prove that gradient descent in neural networks converges to optimal robustness margins at an extremely slow rate of Θ(1/ln(t)), even in simplified two-neuron settings. This establishes the first explicit lower bound on convergence rates for robustness margins in non-linear models, revealing fundamental limitations in neural network training efficiency.
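To see how slow Θ(1/ln(t)) is in practice: because ln(t²) = 2·ln(t), halving the remaining margin gap requires squaring the number of training steps, as the toy calculation below shows (constants ignored):

```python
import math

# gap(t) ∝ 1 / ln(t): halving the gap means squaring the step count t.
for t in (10**3, 10**6, 10**12):
    print(f"t = {t:>16,d}   1/ln(t) = {1 / math.log(t):.4f}")
```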
AI · Neutral · arXiv – CS AI · Mar 37/104
🧠Researchers identify a 'safety mirage' problem in vision language models where supervised fine-tuning creates spurious correlations that make models vulnerable to simple attacks and overly cautious with benign queries. They propose machine unlearning as an alternative that reduces attack success rates by up to 60.27% and unnecessary rejections by over 84.20%.
AI · Bullish · arXiv – CS AI · Mar 37/103
🧠Researchers introduce Robometer, a new framework for training robot reward models that combines progress tracking with trajectory comparisons to better learn from failed attempts. The system is trained on RBM-1M, a dataset of over one million robot trajectories including failures, and shows improved performance across diverse robotics applications.
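A compact sketch of how progress tracking and trajectory comparison can be combined in one reward-model objective: per-step rewards regress onto progress labels, while a Bradley-Terry ranking term pushes successful trajectories above failed ones. The loss weighting and architecture here are assumptions, not Robometer's actual training recipe:

```python
import torch
import torch.nn.functional as F

def reward_model_loss(r_step, progress_target, r_success, r_failure):
    # (a) Progress tracking: per-step rewards regress onto progress labels.
    progress_loss = F.mse_loss(r_step, progress_target)
    # (b) Trajectory comparison: successful trajectory should outscore a failed one.
    ranking_loss = -F.logsigmoid(r_success.sum() - r_failure.sum())
    return progress_loss + ranking_loss

r_step = torch.rand(20, requires_grad=True)           # stand-in model outputs
progress = torch.linspace(0, 1, 20)                   # progress labels 0 -> 1
r_succ, r_fail = torch.rand(20) + 0.5, torch.rand(20)
loss = reward_model_loss(r_step, progress, r_succ, r_fail)
loss.backward()
print(float(loss))
```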