Models, papers, tools. 16,006 articles with AI-powered sentiment analysis and key takeaways.
AI × Crypto · Bearish · crypto.news · Mar 27 · 7/10
🤖The article discusses the rising threat of AI-powered crypto scams and fake exchanges that exploit user urgency and poor verification practices. It highlights how easily fraudulent crypto platforms can mimic legitimate exchanges to drain user funds.
AI · Bearish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers have identified a new vulnerability in large language models called 'natural distribution shifts' where seemingly benign prompts can bypass safety mechanisms to reveal harmful content. They developed ActorBreaker, a novel attack method that uses multi-turn prompts to gradually expose unsafe content, and proposed expanding safety training to address this vulnerability.
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers have published a comprehensive review of Large Language Models for Autonomous Driving (LLM4AD), introducing new benchmarks and conducting real-world experiments on autonomous vehicle platforms. The paper explores how LLMs can enhance perception, decision-making, and motion control in self-driving cars, while identifying key challenges including latency, security, and safety concerns.
AI · Bearish · arXiv – CS AI · Mar 27 · 7/10
🧠Research reveals that open-source large language models (LLMs) lack hierarchical knowledge of visual taxonomies, creating a bottleneck for vision LLMs in hierarchical visual recognition tasks. The study used one million visual question answering tasks across six taxonomies to demonstrate this limitation, finding that even fine-tuning cannot overcome the underlying LLM knowledge gaps.
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠A paradigm shift is occurring in software engineering as AI systems like LLMs increasingly boost development productivity. The paper presents a vision for growing symbiotic partnerships between human developers and AI, identifying key research challenges the software engineering community must address.
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers introduce DRIFT, a new security framework designed to protect AI agents from prompt injection attacks through dynamic rule enforcement and memory isolation. The system uses a three-component approach with a Secure Planner, Dynamic Validator, and Injection Isolator to maintain security while preserving functionality across diverse AI models.
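The summary names DRIFT's three components but not their internals; a minimal sketch of how such a pipeline could fit together, with all function bodies and tool names being illustrative assumptions rather than the paper's method:

```python
# Hypothetical DRIFT-style pipeline: component names come from the
# summary above; their logic here is an illustrative assumption.

def secure_planner(task: str) -> set[str]:
    """Derive an allow-list of tool calls before execution begins."""
    return {"search", "summarize"}  # plan for this (toy) task

def dynamic_validator(allowed: set[str], requested: str) -> bool:
    """Enforce the plan at runtime: block any call not in the allow-list."""
    return requested in allowed

def injection_isolator(retrieved_text: str) -> dict:
    """Quarantine retrieved content so it is treated as data,
    never appended to the agent's prompt as instructions."""
    return {"role": "data", "content": retrieved_text}

allowed = secure_planner("summarize today's news")
assert dynamic_validator(allowed, "search")
assert not dynamic_validator(allowed, "send_email")  # injected call blocked
msg = injection_isolator("IGNORE PREVIOUS INSTRUCTIONS...")
assert msg["role"] == "data"
```

The design point this illustrates is separation of concerns: planning happens before untrusted content is read, so injected text can request tools but cannot expand the allow-list.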
AI · Bearish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers discovered significant privacy vulnerabilities in local Vision-Language Models that use Dynamic High-Resolution preprocessing. The dual-layer attack framework can exploit execution-time variations and cache patterns to infer sensitive information about processed images, even when models run locally for privacy.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers conducted the first systematic study of how weight pruning affects language model representations using Sparse Autoencoders across multiple models and pruning methods. The study reveals that rare features survive pruning better than common ones, suggesting pruning acts as implicit feature selection that preserves specialized capabilities while removing generic features.
🧠 Llama
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers developed AD-CARE, an AI agent that uses large language models to diagnose Alzheimer's disease from incomplete medical data across multiple modalities. The system achieved 84.9% diagnostic accuracy across 10,303 cases and improved physician decision-making speed and accuracy in clinical studies.
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers propose GlowQ, a new quantization technique for large language models that reduces memory overhead and latency while maintaining accuracy. The method uses group-shared low-rank approximation to optimize deployment of quantized LLMs, showing significant performance improvements over existing approaches.
🏢 Perplexity
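The summary describes GlowQ only at the level of "group-shared low-rank approximation"; a toy sketch of that general idea, assuming uniform per-group quantization plus an SVD-based low-rank correction of each group's residual (the function name and all details are hypothetical, not the paper's algorithm):

```python
import numpy as np

def quantize_group_lowrank(W, n_groups=4, bits=4, rank=2):
    """Toy sketch: per-group uniform quantization, then a shared
    rank-r correction fitted to each group's quantization error."""
    groups = np.array_split(np.arange(W.shape[0]), n_groups)
    levels = 2 ** bits - 1
    out = np.empty_like(W)
    for idx in groups:
        block = W[idx]
        lo, hi = block.min(), block.max()
        scale = (hi - lo) / levels or 1.0
        q = np.round((block - lo) / scale)          # integer codes
        deq = q * scale + lo                         # dequantized block
        resid = block - deq                          # quantization error
        U, s, Vt = np.linalg.svd(resid, full_matrices=False)
        correction = (U[:, :rank] * s[:rank]) @ Vt[:rank]  # best rank-r fit
        out[idx] = deq + correction
    return out

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 8))
W_hat = quantize_group_lowrank(W)
```

By the Eckart–Young theorem the rank-r correction can only shrink each group's residual in Frobenius norm, so reconstruction error is never worse than plain quantization, at the cost of storing two small factor matrices per group.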
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers propose a framework for verifying AI model properties at design time rather than after deployment, using algebraic constraints over finitely generated abelian groups. The approach eliminates computational overhead of post-hoc verification by building trustworthiness into the model architecture from the start.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers introduce CRAFT, a multi-agent benchmark that evaluates how well large language models coordinate through natural language communication under partial information constraints. The study finds that stronger reasoning abilities don't reliably translate to better coordination, with smaller open-weight models often matching or outperforming frontier systems in collaborative tasks.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠A user study with 200 participants found that while explanation correctness in AI systems affects human understanding, the relationship is not linear: performance drops significantly at 70% correctness but does not degrade further below that threshold. The research challenges the assumption that higher computational correctness metrics automatically translate to better human comprehension of AI decisions.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers introduce ARC-AGI-3, a new benchmark for testing agentic AI systems that focuses on fluid adaptive intelligence without relying on language or external knowledge. While humans can solve 100% of the benchmark's abstract reasoning tasks, current frontier AI systems score below 1% as of March 2026.
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers introduce the Wireless World Model (WWM), a multi-modal AI framework for 6G networks that predicts wireless channel evolution by understanding electromagnetic wave propagation through 3D geometry. The model demonstrates superior performance across five downstream tasks and real-world measurements, outperforming existing foundation models.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers introduced WebTestBench, a new benchmark for evaluating automated web testing using AI agents and large language models. The study reveals significant gaps between current AI capabilities and industrial deployment needs, with LLMs struggling with test completeness, defect detection, and long-term interaction reliability.
AI · Bullish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers propose HIVE, a new framework for training large language models more efficiently in reinforcement learning by selecting high-utility prompts before rollout. The method uses historical reward data and prompt entropy to identify the 'learning edge' where models learn most effectively, significantly reducing computational overhead without performance loss.
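The summary says HIVE scores prompts using historical reward and entropy to find a 'learning edge'; a hypothetical scoring rule capturing that intuition (the weighting and the idea that the edge sits near a 50% success rate are assumptions, not the paper's formula):

```python
def prompt_utility(mean_reward: float, entropy: float, w: float = 0.5) -> float:
    """Toy utility: prompts the model solves ~half the time sit at the
    'learning edge'; higher output entropy signals remaining uncertainty."""
    edge = 1.0 - abs(mean_reward - 0.5) * 2  # peaks at mean_reward = 0.5
    return w * edge + (1 - w) * entropy

# (success rate, entropy) per prompt, from past rollouts
history = {"p1": (0.95, 0.1), "p2": (0.50, 0.7), "p3": (0.05, 0.2)}
ranked = sorted(history, key=lambda p: prompt_utility(*history[p]), reverse=True)
# → ['p2', 'p3', 'p1']: already-solved and never-solved prompts rank low
```

Skipping rollouts for low-utility prompts is where the compute savings would come from: prompts the model always or never solves yield near-zero gradient signal in RL fine-tuning.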
AI · Bearish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers have developed PIDP-Attack, a new cybersecurity threat that combines prompt injection with database poisoning to manipulate AI responses in Retrieval-Augmented Generation (RAG) systems. The attack method demonstrated 4-16% higher success rates than existing techniques across multiple benchmark datasets and eight different large language models.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠Research reveals that large language models process instructions differently across languages due to social register variations, with imperative commands carrying different obligatory force in different speech communities. The study found that declarative rewording of instructions reduces cross-linguistic variance by 81% and suggests models treat instructions as social acts rather than technical specifications.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers introduce the Quantized Simplex Gossip (QSG) model to explain how multi-agent LLM systems reach consensus through 'memetic drift', in which arbitrary choices compound into collective agreement. The study reveals scaling laws for when collective intelligence operates like a lottery versus amplifying weak biases, providing a framework for understanding AI system behavior in consequential decision-making.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers have identified a fundamental issue in large language models where verbalized confidence scores don't align with actual accuracy due to orthogonal encoding of these signals. They discovered a 'Reasoning Contamination Effect' where simultaneous reasoning disrupts confidence calibration, and developed a two-stage adaptive steering pipeline to improve alignment.
AI · Bearish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers introduced CPGBench, a benchmark evaluating how well Large Language Models detect and follow clinical practice guidelines in healthcare conversations. The study found that while LLMs can detect 71-90% of clinical recommendations, they only adhere to guidelines 22-63% of the time, revealing significant gaps for safe medical deployment.
AI · Bearish · arXiv – CS AI · Mar 27 · 7/10
🧠Research reveals that LLM system prompt configuration creates massive security vulnerabilities, with the same model's phishing detection rates ranging from 1% to 97% based solely on prompt design. The PhishNChips study demonstrates that more specific prompts can paradoxically weaken AI security by replacing robust multi-signal reasoning with exploitable single-signal dependencies.
AI · Bearish · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers have identified a new attack vector called Epistemic Bias Injection (EBI) that manipulates AI language models by injecting factually correct but biased content into retrieval-augmented generation databases. The attack steers model outputs toward specific viewpoints while evading traditional detection methods, though a new defense mechanism called BiasDef shows promise in mitigating these threats.
AI · Neutral · arXiv – CS AI · Mar 27 · 7/10
🧠Researchers propose a unified framework for AI security threats that categorizes attacks based on four directional interactions between data and models. The comprehensive taxonomy addresses vulnerabilities in foundation models through four categories: data-to-data, data-to-model, model-to-data, and model-to-model attacks.