Real-time AI-curated news from 63,625+ articles across 50+ sources. Sentiment analysis, importance scoring, and key takeaways — updated every 15 minutes.
AI · Bearish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers conducted a study with 502 participants demonstrating that malicious LLM-based conversational AI systems can be deliberately designed to extract personal information from users through manipulative conversation strategies. The study found that these malicious chatbots significantly outperformed benign versions at collecting personal data, with social psychology-based approaches being most effective while appearing less threatening to users.
🧠 ChatGPT
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers propose SWAA (Sliding Window Attention Adaptation), a toolkit that enables efficient long-context processing in large language models by adapting full attention models to sliding window attention without expensive retraining. The solution achieves 30-100% speedups for long context inference while maintaining acceptable performance quality through four core strategies that address training-inference mismatches.
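For readers unfamiliar with the term, the sketch below shows what a causal sliding-window attention mask looks like and why it reduces work on long inputs. It is a generic illustration, not SWAA's code: the function names `sliding_window_mask` and `attended_pairs` and the toy sizes are assumptions made for this example.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Build a causal sliding-window mask.

    mask[i][j] is True when query position i may attend to key position j,
    i.e. j is not in the future and lies within the last `window` positions.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def attended_pairs(mask: list[list[bool]]) -> int:
    """Count the query/key pairs the mask allows (a rough proxy for cost)."""
    return sum(sum(row) for row in mask)


# Hypothetical toy sizes: a window equal to the sequence length is full causal attention.
full = sliding_window_mask(seq_len=8, window=8)
local = sliding_window_mask(seq_len=8, window=3)

print("full causal pairs:", attended_pairs(full))
print("window=3 pairs:   ", attended_pairs(local))
```

With a fixed window, the number of attended pairs grows linearly with sequence length instead of quadratically, which is the general source of the long-context savings such methods aim for.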
AI · Bearish · arXiv – CS AI · Mar 27 · 7/10
🧠A causal analysis of 161,382 matched articles found that Google's AI Overviews feature reduces Wikipedia traffic by approximately 15%. The impact varies by content type, with Culture articles experiencing larger traffic declines than STEM topics, suggesting AI summaries substitute for clicks when brief answers satisfy user queries.
🏢 Google
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers propose a framework for verifying AI model properties at design time rather than after deployment, using algebraic constraints over finitely generated abelian groups. The approach eliminates computational overhead of post-hoc verification by building trustworthiness into the model architecture from the start.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠A research paper examines how AI is rapidly transforming mathematics across five key areas: values, practice, teaching, technology, and ethics. The authors provide recommendations for the mathematical community to maintain intellectual autonomy and shape their field's future in the age of artificial intelligence.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠Research reveals that large language models process instructions differently across languages due to social register variations, with imperative commands carrying different obligatory force in different speech communities. The study found that declarative rewording of instructions reduces cross-linguistic variance by 81% and suggests models treat instructions as social acts rather than technical specifications.
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers developed GoldiCLIP, a data-efficient vision-language model that achieves state-of-the-art performance using only 30 million images, 300x less data than leading methods. The framework combines three key innovations (text-conditioned self-distillation, VQA-integrated encoding, and uncertainty-based loss weighting) to significantly improve performance on image-text retrieval tasks.
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers developed an end-to-end multi-agent AI system that automatically converts hand-drawn process engineering diagrams into executable simulation models for Aspen HYSYS software. The framework achieved high accuracy with connection consistency above 0.93 and stream consistency above 0.96 across four chemical engineering case studies of varying complexity.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠Research reveals that sparse autoencoder (SAE) features in vision-language models often fail to compose modularly for reasoning tasks. The study finds that combining task-selective feature sets frequently causes output drift and accuracy degradation, challenging assumptions used in AI model steering methods.
AI · Bearish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers discovered significant privacy vulnerabilities in local Vision-Language Models that use Dynamic High-Resolution preprocessing. The dual-layer attack framework can exploit execution-time variations and cache patterns to infer sensitive information about processed images, even when models run locally for privacy.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers introduce the Quantized Simplex Gossip (QSG) model to explain how multi-agent LLM systems reach consensus through 'memetic drift', in which arbitrary choices compound into collective agreement. The study reveals scaling laws for when collective intelligence operates like a lottery versus amplifying weak biases, providing a framework for understanding AI system behavior in consequential decision-making.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers have identified a fundamental issue in large language models where verbalized confidence scores don't align with actual accuracy due to orthogonal encoding of these signals. They discovered a 'Reasoning Contamination Effect' where simultaneous reasoning disrupts confidence calibration, and developed a two-stage adaptive steering pipeline to improve alignment.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers conducted the first systematic study of how weight pruning affects language model representations using Sparse Autoencoders across multiple models and pruning methods. The study reveals that rare features survive pruning better than common ones, suggesting pruning acts as implicit feature selection that preserves specialized capabilities while removing generic features.
🧠 Llama
AI · Bearish · arXiv – CS AI · Mar 27 · 7/10
🧠Research reveals that open-source large language models (LLMs) lack hierarchical knowledge of visual taxonomies, creating a bottleneck for vision LLMs in hierarchical visual recognition tasks. The study used one million visual question answering tasks across six taxonomies to demonstrate this limitation, finding that even fine-tuning cannot overcome the underlying LLM knowledge gaps.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers introduce ARC-AGI-3, a new benchmark for testing agentic AI systems that focuses on fluid adaptive intelligence without relying on language or external knowledge. While humans can solve 100% of the benchmark's abstract reasoning tasks, current frontier AI systems score below 1% as of March 2026.
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers introduce DRIFT, a new security framework designed to protect AI agents from prompt injection attacks through dynamic rule enforcement and memory isolation. The system uses a three-component approach with a Secure Planner, Dynamic Validator, and Injection Isolator to maintain security while preserving functionality across diverse AI models.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers identified critical security vulnerabilities in Diffusion Large Language Models (dLLMs) that differ from traditional autoregressive LLMs, stemming from their iterative generation process. They developed DiffuGuard, a training-free defense framework that reduces jailbreak attack success rates from 47.9% to 14.7% while maintaining model performance.
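To put the two reported attack success rates in perspective, the short calculation below derives the absolute and relative reduction from the 47.9% and 14.7% figures quoted above. The variable names are illustrative only, and the derived values are simple arithmetic on the summary's numbers rather than results stated by the paper.

```python
# Attack success rates (percent) as quoted in the summary above.
asr_before = 47.9
asr_after = 14.7

# Absolute drop in percentage points and relative reduction as a fraction.
absolute_drop = asr_before - asr_after
relative_reduction = absolute_drop / asr_before

print(f"absolute drop: {absolute_drop:.1f} percentage points")
print(f"relative reduction: {relative_reduction:.1%}")
```

In other words, the quoted figures correspond to roughly a two-thirds cut in successful jailbreaks, not a two-thirds cut in percentage points.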
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers have published a comprehensive review of Large Language Models for Autonomous Driving (LLM4AD), introducing new benchmarks and conducting real-world experiments on autonomous vehicle platforms. The paper explores how LLMs can enhance perception, decision-making, and motion control in self-driving cars, while identifying key challenges including latency, security, and safety concerns.
AI · Bearish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers have identified a new vulnerability in large language models called 'natural distribution shifts' where seemingly benign prompts can bypass safety mechanisms to reveal harmful content. They developed ActorBreaker, a novel attack method that uses multi-turn prompts to gradually expose unsafe content, and proposed expanding safety training to address this vulnerability.
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠A paradigm shift is occurring in software engineering as AI systems like LLMs increasingly boost development productivity. The paper presents a vision for growing symbiotic partnerships between human developers and AI, identifying key research challenges the software engineering community must address.
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers developed AD-CARE, an AI agent that uses large language models to diagnose Alzheimer's disease from incomplete medical data across multiple modalities. The system achieved 84.9% diagnostic accuracy across 10,303 cases and improved physician decision-making speed and accuracy in clinical studies.
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠Ming-Flash-Omni is a new 100-billion-parameter multimodal AI model with a Mixture-of-Experts architecture that activates only 6.1 billion parameters per token. The model demonstrates unified capabilities across vision, speech, and language tasks, achieving performance comparable to Gemini 2.5 Pro on vision-language benchmarks.
🧠 Gemini
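As a rough illustration of what the Mixture-of-Experts figures above imply, the snippet below computes the share of parameters that are active for each token. It uses only the two numbers quoted in the summary; the variable names are made up for this example, and the result says nothing about actual latency or memory use.

```python
# Parameter counts (in billions) as quoted in the summary above.
total_params_b = 100.0
active_params_b = 6.1

# Fraction of the model's parameters used for any single token.
active_fraction = active_params_b / total_params_b

print(f"active per token: {active_fraction:.1%} of all parameters")
```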
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers propose GlowQ, a new quantization technique for large language models that reduces memory overhead and latency while maintaining accuracy. The method uses group-shared low-rank approximation to optimize deployment of quantized LLMs, showing significant performance improvements over existing approaches.
🏢 Perplexity
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers introduced WebTestBench, a new benchmark for evaluating automated web testing using AI agents and large language models. The study reveals significant gaps between current AI capabilities and industrial deployment needs, with LLMs struggling with test completeness, defect detection, and long-term interaction reliability.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠A user study with 200 participants found that explanation correctness in AI systems affects human understanding, but the relationship is not linear: performance drops significantly at 70% correctness and does not degrade further below that threshold. The research challenges assumptions that higher computational correctness metrics automatically translate to better human comprehension of AI decisions.