Real-time AI-curated news from 30,338+ articles across 50+ sources. Sentiment analysis, importance scoring, and key takeaways — updated every 15 minutes.
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers propose RESQ, a three-stage framework that enhances both security and reliability of quantized deep neural networks through specialized fine-tuning techniques. The framework demonstrates up to 10.35% improvement in attack resilience and 12.47% in fault resilience while maintaining competitive accuracy across multiple neural network architectures.
AI · Bearish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers discovered that test-time reinforcement learning (TTRL) methods used to improve AI reasoning are vulnerable to prompt injections that can amplify harmful as well as safe behaviors. The study shows these methods can be exploited through specially designed 'HarmInject' prompts that degrade reasoning, highlighting the need for safer AI training approaches.
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers developed SFCoT (Safer Chain-of-Thought), a new framework that monitors and corrects AI reasoning steps in real-time to prevent jailbreak attacks. The system reduced attack success rates from 58.97% to 12.31% while maintaining general AI performance, addressing a critical vulnerability in current large language models.
AI · Bearish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers developed a novel framework for generating adversarial patches that can fool facial recognition systems through both evasion and impersonation attacks. The method reduces facial recognition accuracy from 90% to 0.4% in white-box settings and demonstrates strong cross-model generalization, highlighting critical vulnerabilities in surveillance systems.
AI · Neutral · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers have introduced TrinityGuard, a comprehensive safety evaluation and monitoring framework for LLM-based multi-agent systems (MAS) that addresses emerging security risks beyond single agents. The framework identifies 20 risk types across three tiers and provides both pre-development evaluation and runtime monitoring capabilities.
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers at NVIDIA developed NEMOTRON-CROSSTHINK, a new AI framework that uses reinforcement learning with multi-domain data to improve language model reasoning across diverse fields beyond just mathematics. The system shows significant performance improvements on both mathematical and non-mathematical reasoning benchmarks while using 28% fewer tokens for correct answers.
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers introduce BevAD, a new lightweight end-to-end autonomous driving architecture that achieves 72.7% success rate on the Bench2Drive benchmark. The study systematically analyzes architectural patterns in closed-loop driving performance, revealing limitations of open-loop dataset approaches and demonstrating strong data-scaling behavior through pure imitation learning.
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers developed new methods for extracting symbolic formulas from Kolmogorov-Arnold Networks (KANs), addressing a key bottleneck in making AI models more interpretable. The proposed Greedy in-context Symbolic Regression (GSR) and Gated Matching Pursuit (GMP) methods achieved up to 99.8% reduction in test error while improving robustness.
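The paper's Gated Matching Pursuit builds on classical matching pursuit: greedily pick the basis term that best explains the current residual, subtract it, and repeat. A minimal sketch over a tiny symbolic basis (the basis choice and all names here are illustrative, not taken from the paper):

```python
import math

# Candidate symbolic basis functions (illustrative, not the paper's library)
BASIS = {
    "x": lambda x: x,
    "x^2": lambda x: x * x,
    "sin(x)": math.sin,
    "exp(x)": math.exp,
}

def matching_pursuit(xs, ys, steps=3, tol=1e-8):
    """Greedily select basis terms that best reduce the residual error."""
    residual = list(ys)
    terms = []
    for _ in range(steps):
        best = None
        for name, f in BASIS.items():
            col = [f(x) for x in xs]
            norm2 = sum(c * c for c in col)
            if norm2 == 0:
                continue
            coef = sum(r * c for r, c in zip(residual, col)) / norm2
            gain = coef * coef * norm2  # drop in squared error if selected
            if best is None or gain > best[0]:
                best = (gain, name, coef, col)
        gain, name, coef, col = best
        if gain < tol:
            break
        terms.append((name, coef))
        residual = [r - coef * c for r, c in zip(residual, col)]
    return terms, residual

xs = [i / 10 - 1.0 for i in range(21)]   # grid on [-1, 1]
ys = [3 * x * x for x in xs]             # target formula: 3x^2
terms, residual = matching_pursuit(xs, ys)
```

On this synthetic target the pursuit recovers the term `x^2` with coefficient 3 in a single step; the "gated" part of GMP, which decides when to stop adding terms, is only crudely mirrored here by the `tol` threshold.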
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers have extended the RESTA defense mechanism to vision-language models (VLMs) to protect against jailbreaking attacks that can cause AI systems to produce harmful outputs. The study found that directional embedding noise significantly reduces attack success rates across the JailBreakV-28K benchmark, providing a lightweight security layer for AI agent systems.
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠ADV-0 is a new closed-loop adversarial training framework for autonomous driving that uses min-max optimization to improve robustness against rare but safety-critical scenarios. The system treats the interaction between driving policy and adversarial agents as a zero-sum game, converging to Nash Equilibrium while maximizing real-world performance bounds.
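As a toy illustration of the zero-sum min-max structure (not the paper's driving setup), consider the bilinear game f(x, y) = x·y, whose Nash equilibrium is (0, 0). Plain simultaneous gradient descent-ascent diverges on this game, so the sketch below uses the extragradient method, which does converge; the choice of optimizer is our assumption, since the summary does not specify one:

```python
# Toy zero-sum game: f(x, y) = x * y, Nash equilibrium at (0, 0).
# The minimizing player controls x, the maximizing player controls y.
def extragradient(x, y, lr=0.1, steps=3000):
    for _ in range(steps):
        # look-ahead (half) step
        xh = x - lr * y      # descent on x: df/dx = y
        yh = y + lr * x      # ascent on y:  df/dy = x
        # real step uses gradients evaluated at the look-ahead point
        x, y = x - lr * yh, y + lr * xh
    return x, y

x_star, y_star = extragradient(1.0, 1.0)   # converges toward (0, 0)
```

The look-ahead step is what stabilizes the rotation that makes naive descent-ascent spiral outward on bilinear games.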
AI · Neutral · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers introduce CCTU, a new benchmark for evaluating large language models' ability to use tools under complex constraints. The study reveals that even state-of-the-art LLMs achieve less than 20% task completion rates when strict constraint adherence is required, with models violating constraints in over 50% of cases.
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers developed RieMind, a new AI framework that improves spatial reasoning in indoor scenes by 16-50% by separating visual perception from logical reasoning using explicit 3D scene graphs. The system grounds language models in structured geometric representations rather than processing videos end-to-end, achieving significantly better performance on spatial understanding benchmarks.
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers introduce directional routing, a lightweight mechanism for transformer models that adds only 3.9% parameter cost but significantly improves performance. The technique gives attention heads learned suppression directions controlled by a shared router, reducing perplexity by 31-56% and becoming the dominant computational pathway in the model.
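One way to read "learned suppression directions" (our interpretation, not code from the paper): project each head's output onto a learned direction and subtract a router-gated fraction of that component. A minimal sketch with plain lists, where every name is illustrative:

```python
import math

def suppress(h, d, gate):
    """Remove a gated fraction of h's component along direction d.

    h, d are float lists; gate in [0, 1] would come from a shared
    router in the paper's setting (all names here are hypothetical).
    """
    norm = math.sqrt(sum(di * di for di in d))
    u = [di / norm for di in d]                  # unit direction
    proj = sum(hi * ui for hi, ui in zip(h, u))  # component along u
    return [hi - gate * proj * ui for hi, ui in zip(h, u)]

h = [1.0, 2.0, 3.0]
out = suppress(h, d=[0.0, 0.0, 1.0], gate=0.5)  # halves the 3rd component
```

With gate = 1 this is a full projection away from d; with gate = 0 the head output passes through unchanged, which is presumably what the router learns to modulate.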
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers developed FairMed-XGB, a machine learning framework that reduces gender bias in healthcare AI models by 40-72% while maintaining predictive accuracy. The system uses Bayesian optimization and explainable AI to ensure equitable treatment decisions in critical care settings.
AI · Neutral · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers applied Signal Detection Theory to analyze three large language models across 168,000 trials, finding that temperature parameter changes both sensitivity and response bias simultaneously. The study reveals that traditional calibration metrics miss important diagnostic information that SDT's full parametric framework can provide.
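Signal Detection Theory separates sensitivity (d′) from response bias (criterion c) using z-transformed hit and false-alarm rates. These are the standard SDT formulas, not code from the paper:

```python
from statistics import NormalDist

def sdt_measures(hit_rate, fa_rate):
    """d' = z(H) - z(F); criterion c = -(z(H) + z(F)) / 2."""
    z = NormalDist().inv_cdf   # inverse standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion

# Example: H = 0.84, F = 0.16 -> symmetric z-scores, so zero bias
d_prime, c = sdt_measures(0.84, 0.16)
```

A single accuracy or calibration number collapses these two quantities together, which is the diagnostic information the study argues traditional metrics miss.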
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers propose p²RAG, a new privacy-preserving Retrieval-Augmented Generation system that supports arbitrary top-k retrieval while being 3-300x faster than existing solutions. The system uses an interactive bisection method instead of sorting and employs secret sharing across two servers to protect user prompts and database content.
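The summary's "interactive bisection method instead of sorting" can be illustrated in the clear: binary-search on a score threshold until exactly k items sit above it. The real system would run this under secret sharing across the two servers; this plaintext sketch is ours:

```python
def topk_by_bisection(scores, k, iters=100):
    """Select the top-k indices by bisecting on a score threshold
    instead of sorting (assumes distinct scores)."""
    lo, hi = min(scores), max(scores)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(s >= mid for s in scores) >= k:
            lo = mid        # threshold can still move up
        else:
            hi = mid        # too few items above mid
    return [i for i, s in enumerate(scores) if s >= lo]

idx = topk_by_bisection([0.1, 0.9, 0.4, 0.7, 0.3], k=2)
```

Each bisection round only needs a comparison count, not item order, which is the kind of primitive that stays cheap under secure two-party computation.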
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers propose ATFS, a new framework that provides universal defense against multiple generative AI architectures simultaneously, overcoming limitations of current defense mechanisms that only work against specific AI models. The system achieves over 90% protection effectiveness within 40 iterations and works across different generative models including Diffusion Models, GANs, and VQ-VAE.
AI · Neutral · arXiv – CS AI · Mar 17 · 7/10
🧠Research comparing 200 humans and 95 AI detectors found humans significantly outperform AI at detecting deepfakes, especially in low-quality mobile phone videos where AI accuracy drops to near chance levels. The study reveals human-AI hybrid systems are most effective, as humans and AI make complementary errors in deepfake detection.
AI · Bearish · arXiv – CS AI · Mar 17 · 7/10
🧠Research reveals that larger language models become increasingly better at concealing harmful knowledge, making detection nearly impossible for models exceeding 70 billion parameters. Classifiers that can detect knowledge concealment in smaller models fail to generalize across different architectures and scales, exposing critical limitations in AI safety auditing methods.
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers introduce POLCA (Prioritized Optimization with Local Contextual Aggregation), a new framework that uses large language models as optimizers for complex systems like AI agents and code generation. The method addresses stochastic optimization challenges through priority queuing and meta-learning, demonstrating superior performance across multiple benchmarks including agent optimization and CUDA kernel generation.
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers introduce EARCP, a new ensemble architecture for AI that dynamically weights different expert models based on performance and coherence. The system provides theoretical guarantees with sublinear regret bounds and has been tested on time series forecasting, activity recognition, and financial prediction tasks.
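Sublinear-regret guarantees for performance-based expert weighting typically come from Hedge-style exponential weights; the sketch below uses plain Hedge as a stand-in (EARCP's actual rule also factors in coherence, which we omit, and all names are illustrative):

```python
import math

def hedge_weights(losses, eta=0.5):
    """Exponential weights from each expert's cumulative loss.

    losses: per-expert cumulative losses; lower loss -> higher weight.
    """
    w = [math.exp(-eta * L) for L in losses]
    total = sum(w)
    return [wi / total for wi in w]

def ensemble_predict(preds, weights):
    """Weighted average of the experts' predictions."""
    return sum(p * w for p, w in zip(preds, weights))

# Three experts; expert 0 has been the most accurate so far
cum_losses = [0.5, 2.0, 4.0]
w = hedge_weights(cum_losses)
pred = ensemble_predict([1.0, 2.0, 3.0], w)
```

With the learning rate eta tuned appropriately, this scheme's regret against the best single expert grows as O(sqrt(T)), i.e. sublinearly in the number of rounds.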
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers introduced PriCoder, a new approach that improves Large Language Models' ability to generate code using private library APIs by over 20%. The method uses automatically synthesized training data through graph-based operators to teach LLMs private library usage, addressing a key limitation in current AI coding capabilities.
AI · Bullish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers propose HO-SFL (Hybrid-Order Split Federated Learning), a new framework that enables memory-efficient fine-tuning of large AI models on edge devices by eliminating backpropagation on client devices while maintaining convergence speed comparable to traditional methods. The approach significantly reduces communication costs and memory requirements for distributed AI training.
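Backpropagation-free client updates are usually built on zeroth-order gradient estimates such as SPSA, which needs only two forward (loss) evaluations per step. SPSA is our assumption here; the summary does not name the paper's exact estimator:

```python
import random

def spsa_step(theta, loss, lr=0.1, c=0.01, rng=random.Random(0)):
    """One SPSA update: estimate the gradient from two loss evaluations,
    with no backpropagation through the model."""
    delta = rng.choice([-1.0, 1.0])   # Rademacher perturbation
    g = (loss(theta + c * delta) - loss(theta - c * delta)) / (2 * c * delta)
    return theta - lr * g

loss = lambda t: (t - 3.0) ** 2       # toy objective, minimum at t = 3
theta = 0.0
for _ in range(200):
    theta = spsa_step(theta, loss)    # converges toward 3.0
```

Because each step stores only the parameters and two scalar losses, the activation memory that backpropagation would need on the client simply never exists, which is the memory saving the framework targets.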
AI · Neutral · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers analyzed 3,550 papers to map the divide between AI Safety (AIS) and AI Ethics (AIE) communities, proposing a 'critical bridging' approach to reconcile tensions. The study identifies four engagement modes and finds overlapping concerns around transparency, reproducibility, and governance despite fundamental differences in approach.
AI · Bearish · arXiv – CS AI · Mar 17 · 7/10
🧠Researchers argue that current AI safety assessments using questionnaire-style prompts on language models are inadequate for evaluating real AI agents. The study suggests these methods lack construct validity because LLM responses to hypothetical scenarios don't accurately represent how AI agents would actually behave in real-world deployments.