AI

22,940 AI articles curated from 50+ sources with AI-powered sentiment analysis, importance scoring, and key takeaways.

AI · Bearish · arXiv – CS AI · Jun 27/10

Detector-Evasive LLM Paraphrasing via Constrained Policy Optimization

Researchers present DEPO, a reinforcement learning algorithm that enables large language models to evade AI-text detectors through paraphrasing while maintaining semantic fidelity. The constrained optimization approach treats detector evasion as the primary objective with semantic preservation as an explicit constraint, demonstrating robust performance across multiple detectors and datasets.

AI · Bullish · arXiv – CS AI · Jun 27/10

Real2SAM2Real: Generative 3D Caches as Complementary Context for Video Diffusion

Researchers introduce Real2SAM2Real, a framework that enhances Video Diffusion Models by incorporating explicit 3D geometric caches extracted from SAM3D models, enabling more precise control over camera movements and scene dynamics while maintaining structural consistency in complex occlusions and high-motion scenarios.

AI · Bearish · arXiv – CS AI · Jun 27/10

Digital-to-Physical Transfer of Adversarial Patches for Aerial Vehicle Detection

Researchers demonstrate that adversarial patches (printable patterns designed to fool AI object detectors) can be physically deployed against aerial vehicle detection systems with significant effectiveness. The study reveals that patches placed directly on vehicles outperform digitally optimized designs in real-world conditions, exposing critical vulnerabilities in deep neural network-based detection systems used for surveillance and monitoring applications.

AI · Bullish · arXiv – CS AI · Jun 27/10

CRISP -- Clustering-Based Redundancy-Reduced Instance Sampling for Pathology Case Representation and Retrieval

CRISP is an unsupervised machine learning framework that automates the analysis of multiple whole-slide images (WSIs) in digital pathology by selectively sampling informative patches across all slides in a case rather than relying on a single pathologist-selected slide. The approach matches or exceeds current clinical practice for breast cancer diagnosis and retrieval while eliminating subjective slide selection and reducing computational burden.

AI · Bearish · arXiv – CS AI · Jun 27/10

Can Vision Models Truly Forget? Mirage: Representation-Level Certification of Visual Unlearning

Researchers introduce Mirage, a representation-level auditing framework that reveals existing machine unlearning methods in federated learning fail to truly forget sensitive data despite passing output-level tests. The study demonstrates that current approaches retain substantial class structure in internal representations, exposing a critical gap between certification standards and actual data privacy.

AI · Bullish · arXiv – CS AI · Jun 27/10

Continuous Reasoning for Vision-Language-Action

Researchers propose Continuous Reasoning for Vision-Language-Action (VLA), a framework that uses shared Gaussian latent representations instead of discrete tokens to enable robotic control. The approach achieves a 40.4% improvement on robotic manipulation tasks, suggesting that effective AI reasoning for physical control requires verifiable, shareable internal representations rather than explicit language.

AI · Bullish · arXiv – CS AI · Jun 27/10

A Protocol-Language Model for Network Intrusion (Without Deep Packet Inspection)

Researchers present PLM-NIDS, a machine learning system that detects network intrusions by analyzing packet metadata patterns rather than encrypted payload content, achieving 97.7% precision without requiring access to encrypted traffic. The approach uses an RWKV state-space model to learn the "grammar" of benign network behavior, identifying attacks as statistical deviations from normal flow patterns.

AI · Bullish · arXiv – CS AI · Jun 27/10

RAFT: Data Refinement and Adaptive Distillation for Domain Fine-Tuning with Alleviated Forgetting

Researchers introduce RAFT, a framework that addresses catastrophic forgetting in domain-specific fine-tuning of language models. By combining data refinement with answer-conditioned distillation, RAFT achieves a 23.2% improvement in domain accuracy while recovering 10-18% of the general capability losses typically incurred during fine-tuning.

AI · Neutral · arXiv – CS AI · Jun 27/10

StemBind: When MLLMs Get Lost Between Rules and Instances in Abstract Visual Reasoning

Researchers introduce StemBind, a diagnostic benchmark revealing that multimodal large language models can identify visual patterns and rules but frequently fail at the final step of matching answers to those rules. Across 24 frontier models tested on 19,533 tasks, the study identifies rule-to-instance binding (mapping abstract rules to specific visual examples) as the critical bottleneck, a failure point that neither scaling nor chain-of-thought prompting reliably resolves.

AI · Bullish · arXiv – CS AI · Jun 27/10

BudgetDraft: Acceptance-Aware Multi-View Training for Sparse-KV Speculative Decoding

BudgetDraft is a new training method for sparse-KV speculative decoding that enables faster language model inference under memory constraints. By training drafters to handle multiple KV cache budgets simultaneously, the technique achieves speedups of up to 6.55x on mid-to-long context inference while maintaining acceptance rates and reducing GPU memory usage.

AI · Bullish · arXiv – CS AI · Jun 27/10

AgentxGCore: Agentic AI for Next-Generation Mobile Core Network

AgentxGCore proposes an AI-native architecture for next-generation mobile core networks (6G) using multi-agent systems that enable autonomous network optimization and management. The framework combines agentic AI with intent-based networking to replace centralized network management with self-organizing, self-adapting systems that leverage large language models for real-time decision-making.

AI · Bearish · arXiv – CS AI · Jun 27/10

Persona Attack: Incremental Memory Injection Jailbreak Attack against Large Language Models

Researchers have identified a new jailbreak attack called Persona Attack that exploits LLMs' memory and conversation context to bypass safety mechanisms. By incrementally injecting instructions through dialogue, the attack achieves success rates of up to 95%, demonstrating that accumulated memory instructions can override built-in safety alignment regardless of traditional safety training.

AI · Neutral · arXiv – CS AI · Jun 27/10

On Effectiveness and Efficiency of Agentic Tool-calling and RL Training

A new research paper identifies critical inconsistencies in how tool-calling capabilities are evaluated across LLM agents, showing that minor implementation choices significantly affect benchmark results. The authors propose two optimization techniques that accelerate reinforcement learning-based tool-calling training while maintaining performance levels.

AI · Bullish · arXiv – CS AI · Jun 27/10

Lying Is Just a Phase: The Hidden Alignment Transition in Language Model Scaling

Researchers find that the relationship between reasoning and truthfulness in language models undergoes a phase transition at around 3.5B parameters: in smaller models the two capabilities are anticorrelated, while in larger models they cooperate. This hidden alignment transition is invisible to standard loss curves but can be diagnosed from public benchmarks alone, and a proof-of-concept intervention demonstrates that adding a truth-direction vector can correct misaligned outputs without retraining.

AI · Bearish · arXiv – CS AI · Jun 27/10

PrivacyPeek: Auditing What LLM-Based Agents Acquire, Not Just What They Say

PrivacyPeek introduces a new benchmark for evaluating privacy vulnerabilities in LLM-based agents, revealing that autonomous AI systems routinely acquire sensitive information beyond what tasks require. The research demonstrates that existing privacy audits miss critical acquisition-stage leakage, where data enters the agent's context, and that current prompt-level defenses are largely ineffective.

AI · Neutral · arXiv – CS AI · Jun 27/10

Generative AI and Digital Ecosystem Resilience: A Proactive Lifecycle-Based Survey

A comprehensive survey examines how generative AI has accelerated adversarial synthetic content creation, necessitating a shift from reactive to proactive detection methods. Using the C5 Interaction Model framework, researchers integrate machine learning with social science approaches to detect coordinated inauthentic behavior, synthetic narrative propagation, and emerging threats across information ecosystems.

AI · Bullish · arXiv – CS AI · Jun 27/10

Diffusion Image Generation with Explicit Modeling of Data Manifold Geometry

Researchers introduce MIND (Data Manifold-aware Image diffusioN moDel), a novel diffusion-based image generation framework that combines discrete patch tokenization with continuous diffusion modeling. The approach achieves significant performance improvements, reducing FID scores to 2.06 on ImageNet-256×256 with guidance using only 130M parameters, substantially outperforming larger baseline models.

AI · Bullish · arXiv – CS AI · Jun 27/10

CoilDrop-MRI: Self-supervised physics-guided MRI reconstruction with coil dropout

Researchers introduce CoilDrop-MRI, a self-supervised deep learning method that improves accelerated MRI reconstruction by strategically dropping data across receiver coils rather than only in k-space. Validated across multiple hospital sites and field strengths, the approach matches supervised methods' quality without requiring fully sampled training data, offering practical efficiency gains for medical imaging.

AI · Bullish · arXiv – CS AI · Jun 27/10

DLLM-JEPA: Joint Embedding Predictive Architectures for Masked Diffusion Language Models

Researchers introduce DLLM-JEPA, a new self-supervised learning approach that combines Joint Embedding Predictive Architectures with masked-diffusion language models. The method eliminates the need for explicit multi-view training data and reduces computational costs by 33% compared to prior LLM-JEPA while achieving significant performance improvements across multiple benchmarks.

AI · Bearish · arXiv – CS AI · Jun 27/10

CardioLens: Revealing the Clinical Reality Gap of MLLMs via Multi-Sequence Cardiac MRI Evaluations

Researchers introduce CardioLens, a rigorous evaluation framework revealing that state-of-the-art multimodal large language models (MLLMs) perform poorly at clinical cardiac MRI interpretation despite strong public benchmark results. The study demonstrates a significant gap between theoretical capabilities and real-world clinical applicability, with models failing to integrate distributed evidence across imaging sequences and temporal phases.

AI · Bullish · arXiv – CS AI · Jun 27/10

Project SPARROW and the Future of Conservation Technology

SPARROW is an open-source hardware-software platform that combines solar power, edge AI, and satellite connectivity to enable autonomous biodiversity monitoring in remote ecosystems. Deployed across four continents, the system collected over 2 million images and recordings in 190 days while operating continuously without human intervention, establishing a foundation for distributed ecological monitoring networks.

AI · Neutral · arXiv – CS AI · Jun 27/10

Diagnosing LLM Arbitration Behavior over Pre-evidence Epistemic States in RAG-based Fact-Checking

Researchers introduce PAVE, a diagnostic framework for evaluating how large language models arbitrate between their parametric knowledge and retrieved evidence in RAG-based fact-checking systems. Testing across seven LLMs reveals inconsistent and model-dependent behavior when prior knowledge conflicts with retrieved context, prompting the development of a lightweight test-time correction method to improve factual reliability.

AI · Bullish · arXiv – CS AI · Jun 27/10

AI-PROPELLER: Warehouse-Scale Interprocedural Code Layout Optimization with AlphaEvolve

AI-PROPELLER introduces the first warehouse-scale interprocedural code layout optimization system, using an evolutionary AI workflow to improve binary performance by 0.23-1.6% beyond existing post-link optimizers. This breakthrough applies machine learning to compiler optimization in industrial production environments, achieving measurable real-world performance gains.

AI · Bearish · arXiv – CS AI · Jun 27/10

Silent Failures in Physical AI: A Literature Review of Runtime Action Authorization for Autonomous Systems

A literature review identifies a critical safety gap in Physical AI systems—autonomous robots, drones, and vehicles that make physically consequential decisions based on visual and language inputs. The research reveals that existing safety mechanisms from AI content moderation and robotics operate independently, leaving no unified runtime authorization system to prevent silent failures where confident but incorrect model outputs cause real-world harm before hardware safeguards activate.

AI · Bullish · arXiv – CS AI · Jun 27/10

From Human Videos to Robot Manipulation: A Survey on Scalable Vision-Language-Action Learning with Human-Centric Data

A comprehensive survey examines how human videos can be leveraged to train Vision-Language-Action (VLA) models for robot manipulation, addressing the limitation that robot demonstrations are expensive and embodiment-specific. The research categorizes four approaches for extracting actionable knowledge from human videos and identifies critical open challenges in video structuring, embodiment transfer, and real-world evaluation.
