Real-time AI-curated news from 30,347+ articles across 50+ sources. Sentiment analysis, importance scoring, and key takeaways — updated every 15 minutes.
AIBullisharXiv – CS AI · Mar 177/10
🧠Researchers introduce D-MEM, a biologically-inspired memory architecture for AI agents that uses dopamine-like reward prediction error routing to dramatically reduce computational costs. The system reduces token consumption by over 80% and eliminates quadratic scaling bottlenecks by selectively processing only high-importance information through cognitive restructuring.
AIBullisharXiv – CS AI · Mar 177/10
🧠Researchers introduce FlashHead, a training-free replacement for classification heads in language models that delivers up to 1.75x inference speedup while maintaining accuracy. The innovation addresses a critical bottleneck where classification heads consume up to 60% of model parameters and 50% of inference compute in modern language models.
🧠 Llama
AIBullisharXiv – CS AI · Mar 177/10
🧠ADV-0 is a new closed-loop adversarial training framework for autonomous driving that uses min-max optimization to improve robustness against rare but safety-critical scenarios. The system treats the interaction between driving policy and adversarial agents as a zero-sum game, converging to Nash Equilibrium while maximizing real-world performance bounds.
AINeutralarXiv – CS AI · Mar 177/10
🧠Researchers analyzed 3,550 papers to map the divide between AI Safety (AIS) and AI Ethics (AIE) communities, proposing a 'critical bridging' approach to reconcile tensions. The study identifies four engagement modes and finds overlapping concerns around transparency, reproducibility, and governance despite fundamental differences in approach.
AIBearisharXiv – CS AI · Mar 177/10
🧠Researchers argue that current AI safety assessments using questionnaire-style prompts on language models are inadequate for evaluating real AI agents. The study suggests these methods lack construct validity because LLM responses to hypothetical scenarios don't accurately represent how AI agents would actually behave in real-world deployments.
AIBullisharXiv – CS AI · Mar 177/10
🧠SPARQ introduces a unified framework combining spiking neural networks, quantization-aware training, and reinforcement learning-guided early exits for energy-efficient edge AI. The system achieves up to 5.15% higher accuracy than conventional quantized SNNs while reducing system energy consumption by over 330 times and cutting synaptic operations by over 90%.
AIBullisharXiv – CS AI · Mar 177/10
🧠Researchers have extended the RESTA defense mechanism to vision-language models (VLMs) to protect against jailbreaking attacks that can cause AI systems to produce harmful outputs. The study found that directional embedding noise significantly reduces attack success rates across the JailBreakV-28K benchmark, providing a lightweight security layer for AI agent systems.
AINeutralarXiv – CS AI · Mar 177/10
🧠Researchers identified that medical multimodal large language models (MLLMs) fail primarily due to inadequate visual grounding capabilities when analyzing medical images, unlike their success with natural scenes. They developed VGMED evaluation dataset and proposed VGRefine method, achieving state-of-the-art performance across 6 medical visual question-answering benchmarks without additional training.
AINeutralarXiv – CS AI · Mar 177/10
🧠Researchers propose group-conditional federated conformal prediction (GC-FCP), a new protocol that enables trustworthy AI uncertainty quantification across distributed clients while providing coverage guarantees for specific groups. The framework addresses challenges in federated learning for applications in healthcare, finance, and mobile sensing by creating compact weighted summaries that support efficient calibration.
AINeutralarXiv – CS AI · Mar 177/10
🧠Researchers developed UMID, a new text-only auditing framework to detect if personally identifiable information was memorized during training of multimodal AI models like CLIP and CLAP. The method significantly improves efficiency and effectiveness of membership inference attacks while maintaining privacy constraints.
AINeutralarXiv – CS AI · Mar 177/10
🧠Researchers demonstrate that current audio deepfake detection systems incorrectly classify legitimate speech processing technologies like voice conversion and restoration as fake audio. A new multi-class detection approach shows improved accuracy by distinguishing between authentic speech, benign modifications, and actual spoofing attempts.
AIBullisharXiv – CS AI · Mar 177/10
🧠Researchers developed new methods for extracting symbolic formulas from Kolmogorov-Arnold Networks (KANs), addressing a key bottleneck in making AI models more interpretable. The proposed Greedy in-context Symbolic Regression (GSR) and Gated Matching Pursuit (GMP) methods achieved up to 99.8% reduction in test error while improving robustness.
AIBearisharXiv – CS AI · Mar 177/10
🧠A comprehensive study of 19 large language models reveals systematic racial bias in automated text annotation, with over 4 million judgments showing LLMs consistently reproduce harmful stereotypes based on names and dialect. The research demonstrates that AI models rate texts with Black-associated names as more aggressive and those written in African American Vernacular English as less professional and more toxic.
AIBearisharXiv – CS AI · Mar 177/10
🧠Researchers developed SWhisper, a framework that uses near-ultrasonic audio to deliver covert jailbreak attacks against speech-driven AI systems. The technique is inaudible to humans but can successfully bypass AI safety measures with up to 94% effectiveness on commercial models.
AIBullisharXiv – CS AI · Mar 177/10
🧠Researchers introduce APEX-Searcher, a new framework that enhances large language models' search capabilities through a two-stage approach combining reinforcement learning for strategic planning and supervised fine-tuning for execution. The system addresses limitations in multi-hop question answering by decoupling retrieval processes into planning and execution phases, showing significant improvements across multiple benchmarks.
AIBullisharXiv – CS AI · Mar 177/10
🧠Researchers developed AD-Copilot, a specialized multimodal AI assistant for industrial anomaly detection that outperforms existing models and even human experts. The system uses a novel visual comparison approach and achieved 82.3% accuracy on benchmarks, representing up to 3.35x improvement over baselines.
🏢 Microsoft
AIBullisharXiv – CS AI · Mar 177/10
🧠Researchers propose PaIR-Drive, a new parallel framework that combines imitation learning and reinforcement learning for autonomous driving, achieving 91.2 PDMS performance on NAVSIMv1 benchmark. The approach addresses limitations of sequential fine-tuning by running IL and RL in parallel branches, enabling better performance than existing methods.
AIBullisharXiv – CS AI · Mar 177/10
🧠Researchers have developed UniVid, a new pyramid diffusion model that unifies text-to-video and image-to-video generation into a single system. The model uses dual-stream cross-attention mechanisms to process both text prompts and reference images, achieving superior temporal coherence across different video generation tasks.
AIBearisharXiv – CS AI · Mar 177/10
🧠Researchers introduce τ-voice, a new benchmark for evaluating full-duplex voice AI agents on complex real-world tasks. The study reveals significant performance gaps, with voice agents achieving only 30-45% of text-based AI capability under realistic conditions with noise and diverse accents.
🧠 GPT-5
AIBearisharXiv – CS AI · Mar 177/10
🧠A comprehensive study of six major LLM families reveals systematic biases in moral judgments based on gender pronouns and grammatical markers. The research found that AI models consistently favor non-binary subjects while penalizing male subjects in fairness assessments, raising concerns about embedded biases in AI ethical decision-making.
🏢 Meta🧠 Grok
AIBearisharXiv – CS AI · Mar 177/10
🧠Researchers introduce EvoClaw, a new benchmark that evaluates AI agents on continuous software evolution rather than isolated coding tasks. The study reveals a critical performance drop from >80% on isolated tasks to at most 38% in continuous settings across 12 frontier models, highlighting AI agents' struggle with long-term software maintenance.
AIBullisharXiv – CS AI · Mar 177/10
🧠Researchers developed a two-agent defense system called OpenClaw that achieved 0% attack success rate against prompt injection attacks on LLM applications. The system uses agent isolation and JSON formatting to structurally prevent malicious prompts from reaching action-taking agents.
AIBullisharXiv – CS AI · Mar 177/10
🧠Researchers developed a new framework to remove backdoors from large language models without prior knowledge of triggers or clean reference models. The method uses an immunization-inspired approach that creates synthetic backdoored variants to identify and neutralize malicious components while preserving the model's generative capabilities.
AINeutralarXiv – CS AI · Mar 177/10
🧠Researchers developed Prefix-Shared KV Cache (PSKV), a new technique that accelerates jailbreak attacks on Large Language Models by 40% while reducing memory usage by 50%. The method optimizes the red-teaming process by sharing cached prefixes across multiple attack attempts, enabling more efficient parallel inference without compromising attack success rates.
AIBullisharXiv – CS AI · Mar 177/10
🧠Researchers introduce Prism, a new self-evolving AI reasoning system that prevents diversity collapse in problem generation by maintaining semantic coverage across mathematical problem spaces. The system achieved significant accuracy improvements over existing methods on mathematical reasoning benchmarks and generated 100k diverse mathematical questions.