Real-time AI-curated news from 31,138+ articles across 50+ sources. Sentiment analysis, importance scoring, and key takeaways — updated every 15 minutes.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers introduce FlashPrefill, a new framework that dramatically improves Large Language Model efficiency during the prefilling phase through advanced sparse attention mechanisms. The system achieves up to a 27.78x speedup on long 256K-token sequences while still maintaining a 1.71x speedup on shorter 4K contexts.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers introduced SPARC, a framework that creates unified latent spaces across different AI models and modalities, enabling direct comparison of how various architectures represent identical concepts. The method achieves 0.80 Jaccard similarity on Open Images, tripling alignment compared to previous methods, and enables practical applications like text-guided spatial localization in vision-only models.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers present a comprehensive survey of Predictive Coding Networks (PCNs), a neuroscience-inspired AI approach that uses biologically plausible inference learning instead of traditional backpropagation. PCNs can achieve higher computational efficiency with parallelization and offer a more versatile framework for both supervised and unsupervised learning compared to traditional neural networks.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers developed AIRT, an AI-powered radiation therapy planning system that generates complete prostate cancer treatment plans in under one second using deep learning. The system processes CT scans and anatomical data to produce clinically viable radiation treatment plans 100x faster than current methods, demonstrating non-inferiority to existing commercial solutions.
🏢 Nvidia
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers introduce COLD-Steer, a training-free framework that enables efficient control of large language model behavior at inference time using just a few examples. The method approximates gradient descent effects without parameter updates, achieving 95% steering effectiveness while using 50 times fewer samples than existing approaches.
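The summary doesn't detail how COLD-Steer approximates gradient descent, so the sketch below shows the simpler, widely used training-free baseline it resembles: activation steering, where a direction computed from a handful of contrastive examples is added to a hidden state at inference time. All function names and the toy data are hypothetical.

```python
import numpy as np

def steering_vector(pos_acts, neg_acts):
    """Mean-difference direction between desired and undesired activations."""
    return np.mean(pos_acts, axis=0) - np.mean(neg_acts, axis=0)

def apply_steering(hidden, vec, alpha=1.0):
    """Shift a hidden state along the steering direction; no parameter updates."""
    return hidden + alpha * vec

# Toy example: 4 "positive" and 4 "negative" example activations in 8 dims,
# standing in for activations collected from a handful of prompts.
rng = np.random.default_rng(0)
pos = rng.normal(1.0, 0.1, size=(4, 8))
neg = rng.normal(-1.0, 0.1, size=(4, 8))
vec = steering_vector(pos, neg)

h = rng.normal(size=8)                     # some hidden state at inference time
h_steered = apply_steering(h, vec, alpha=0.5)
```

The appeal of this family of methods, and presumably of COLD-Steer, is that the per-example cost is a single vector addition, so a few examples suffice where fine-tuning would need thousands.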
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠LUMINA is a new LLM-driven framework for GPU architecture exploration that uses AI to optimize GPU designs for modern AI workloads like LLM inference. The system achieved 17.5x higher efficiency than traditional methods and identified 6 designs superior to NVIDIA's A100 GPU using only 20 exploration steps.
AI · Neutral · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers introduce AdAEM, a new evaluation algorithm that automatically generates test questions to better assess value differences and biases across Large Language Models. Unlike static benchmarks, AdAEM adaptively creates controversial topics that reveal more distinguishable insights about LLMs' underlying values and cultural alignment.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers developed new Monte Carlo inference strategies inspired by Bayesian Experimental Design to improve AI agents' information-seeking capabilities. The methods significantly enhanced language models' performance in strategic decision-making tasks, with weaker models like Llama-4-Scout outperforming GPT-5 at 1% of the cost.
🧠 GPT-5 · 🧠 Llama
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers have developed Hyper++, a new hyperbolic deep reinforcement learning agent that solves optimization challenges in hyperbolic geometry-based RL. The system outperforms previous approaches by 30% in training speed and demonstrates superior performance on benchmark tasks through improved gradient stability and feature regularization.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers propose FLoRG, a new federated learning framework for efficiently fine-tuning large language models that reduces communication overhead by up to 2041x while improving accuracy. The method uses Gram matrix aggregation and Procrustes alignment to solve aggregation errors and decomposition drift issues in distributed AI training.
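FLoRG's full aggregation pipeline isn't given in the summary, but the Procrustes alignment it names is the classical orthogonal Procrustes problem: find the rotation that best maps one factor matrix onto another before averaging, which is plausibly how it corrects decomposition drift between clients' low-rank adapters. The closed-form SVD solution, as a sketch:

```python
import numpy as np

def procrustes_align(A, B):
    """Orthogonal R minimising ||A @ R - B||_F (Schönemann's SVD solution)."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt

# Toy check: two clients' low-rank factors that differ only by a rotation
# (the "decomposition drift") are realigned before they can be averaged.
rng = np.random.default_rng(0)
A = rng.normal(size=(16, 4))              # client 1's rank-4 factor
theta = 0.7
R_true = np.eye(4)
R_true[:2, :2] = [[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]]
B = A @ R_true                            # client 2: same subspace, rotated basis
R = procrustes_align(A, B)
print(np.allclose(A @ R, B))  # True: drift removed, factors now averageable
```

Naively averaging `A` and `B` would mix incompatible bases; aligning first keeps the shared subspace intact, which matches the paper's claim of fixing aggregation error.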
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers propose Stem, a new sparse attention mechanism for Large Language Models that reduces computational complexity while maintaining accuracy. The method uses position-dependent token selection and output-aware metrics to optimize information flow in causal attention, achieving faster pre-filling with better performance.
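The summary doesn't define Stem's output-aware metric, so the sketch below illustrates only the general idea of position-dependent sparse causal attention: each query keeps its own top-k highest-scoring earlier tokens instead of attending to the full prefix. Function and variable names are hypothetical.

```python
import numpy as np

def topk_sparse_causal_attention(q, k, v, keep=4):
    """Each query attends only to its `keep` highest-scoring earlier tokens."""
    T, d = q.shape
    out = np.zeros_like(v)
    for t in range(T):
        scores = q[t] @ k[: t + 1].T / np.sqrt(d)  # causal: positions 0..t only
        idx = np.argsort(scores)[-keep:]           # position-dependent selection
        w = np.exp(scores[idx] - scores[idx].max())
        w /= w.sum()
        out[t] = w @ v[idx]
    return out

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(8, 16)) for _ in range(3))
y = topk_sparse_causal_attention(q, k, v, keep=4)
print(y.shape)  # (8, 16)
```

With full attention the prefill cost grows quadratically in sequence length; keeping a fixed number of tokens per query makes it linear, which is the source of the faster pre-filling both this paper and FlashPrefill above target.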
AI · Neutral · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers demonstrate that traditional explainable AI methods designed for static predictions fail when applied to agentic AI systems that make sequential decisions over time. The study shows attribution-based explanations work well for static tasks but trace-based diagnostics are needed to understand failures in multi-step AI agent behaviors.
AI · Neutral · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers developed a method called "Personality Engineering" to create AI models with diverse personality traits through continued pre-training on domain-specific texts. The study found that performance peaks in two personality types, "Expressive Generalists" and "Suppressed Specialists," with reduced social traits actually improving complex reasoning abilities.
AI · Neutral · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers introduced LLMTM, a comprehensive benchmark to evaluate Large Language Models' performance on temporal motif analysis in dynamic graphs. The study tested nine different LLMs and developed a structure-aware dispatcher that balances accuracy with cost-effectiveness for graph analysis tasks.
🧠 GPT-4
AI · Bearish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers developed WBC (Window-Based Comparison), a new membership inference attack method that significantly outperforms existing approaches by analyzing localized patterns in Large Language Models rather than global signals. The technique achieves 2-3 times better detection rates and exposes critical privacy vulnerabilities in fine-tuned LLMs through sliding window analysis and binary voting mechanisms.
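WBC's exact scoring rule isn't given in the summary; the sketch below shows one plausible reading of "sliding window analysis and binary voting" for membership inference: compare the target model's per-token losses against a reference model's within each local window, let each window cast a member/non-member vote, and score by vote fraction. All names and the toy data are hypothetical.

```python
import numpy as np

def window_vote_attack(target_losses, reference_losses, window=8, margin=0.0):
    """Membership score via sliding-window comparison and binary voting.

    Inputs are per-token losses of the suspect (possibly fine-tuned) model and
    a reference model on the same text. A window votes 'member' when the
    target's local mean loss undercuts the reference's by more than `margin`;
    the returned score is the fraction of member votes.
    """
    t = np.asarray(target_losses, dtype=float)
    r = np.asarray(reference_losses, dtype=float)
    votes = [
        t[i : i + window].mean() + margin < r[i : i + window].mean()
        for i in range(len(t) - window + 1)
    ]
    return float(np.mean(votes))

# Toy example: memorised training text shows locally depressed loss.
rng = np.random.default_rng(0)
ref = rng.uniform(2.0, 3.0, size=64)                  # reference model losses
member = ref - 0.5 + rng.normal(0, 0.05, size=64)     # target: memorised text
print(window_vote_attack(member, ref))  # close to 1.0
```

The intuition behind the paper's claim is that memorization leaves localized dips in loss that a single global average (as in earlier attacks) washes out, while per-window voting preserves them.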
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers propose a new method for training large language models (LLMs) that addresses the diversity loss problem in reinforcement learning approaches. Their technique uses the α-divergence family to better balance precision and diversity in reasoning tasks, achieving state-of-the-art performance on theorem-proving benchmarks.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Google's Gemini-based AI models, particularly Gemini Deep Think, have demonstrated the ability to collaborate with researchers to solve open problems and generate new proofs across theoretical computer science, economics, optimization, and physics. The research identifies effective techniques for human-AI collaboration including iterative refinement, problem decomposition, and deploying AI as adversarial reviewers to detect flaws in existing proofs.
🧠 Gemini
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers introduce RM-R1, a new class of Reasoning Reward Models (ReasRMs) that integrate chain-of-thought reasoning into reward modeling for large language models. The models outperform much larger competitors including GPT-4o by up to 4.9% across reward model benchmarks by using a chain-of-rubrics mechanism and two-stage training process.
🧠 GPT-4 · 🧠 Llama
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers introduce BEVLM, a framework that integrates Large Language Models with Bird's-Eye View representations for autonomous driving. The approach improves LLM reasoning accuracy in cross-view driving scenarios by 46% and enhances end-to-end driving performance by 29% in safety-critical situations.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers introduce generative predictive control, a new AI framework that enables robots to perform fast, dynamic tasks without requiring expert demonstrations. The method uses flow matching policies that can handle high-frequency feedback and maintain temporal consistency, addressing key limitations of current robotics approaches.
AI · Neutral · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers present a new framework for uncertainty quantification in AI agents, highlighting critical gaps in current research that focuses on single-turn interactions rather than complex multi-step agent deployments. The paper identifies four key technical challenges and proposes foundations for safer AI agent systems in real-world applications.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers introduce DataChef-32B, an AI system that uses reinforcement learning to automatically generate optimal data processing recipes for training large language models. The system eliminates the need for manual data curation by automatically designing complete data pipelines, achieving performance comparable to human experts across six benchmark tasks.
AI · Neutral · arXiv – CS AI · Mar 9 · 7/10
🧠New research reveals that generative AI creates a paradox where it equalizes individual task performance but may increase aggregate inequality by concentrating economic value in complementary assets. The study presents a formal model showing two inequality regimes dependent on AI's technology structure and labor market institutions.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Researchers propose Traversal-as-Policy, a method that distills AI agent execution logs into Gated Behavior Trees (GBTs) to create safer, more efficient autonomous agents. The approach significantly improves success rates while reducing safety violations and computational costs across multiple benchmarks.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠Google DeepMind introduces Aletheia, an AI research agent powered by Gemini Deep Think that can autonomously conduct mathematical research from problem-solving to generating complete research papers. The system has successfully produced research papers without human intervention and solved four open mathematical problems from established databases.
🏢 Google · 🧠 Gemini