🧠

AI

21,049 AI articles curated from 50+ sources with AI-powered sentiment analysis, importance scoring, and key takeaways.


AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

CLAG: Adaptive Memory Organization via Agent-Driven Clustering for Small Language Model Agents

Researchers introduce CLAG, a clustering-based memory framework that helps small language model agents organize and retrieve information more effectively. The system addresses memory dilution issues by creating semantic clusters with automated profiles, showing improved performance across multiple QA datasets.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

MA-VLCM: A Vision Language Critic Model for Value Estimation of Policies in Multi-Agent Team Settings

Researchers propose MA-VLCM, a framework that uses pretrained vision-language models as centralized critics in multi-agent reinforcement learning instead of learning critics from scratch. This approach significantly improves sample efficiency and enables zero-shot generalization while producing compact policies suitable for resource-constrained robots.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

A Closer Look into LLMs for Table Understanding

Researchers conducted an empirical study on 16 Large Language Models to understand how they process tabular data, revealing a three-phase attention pattern and finding that tabular tasks require deeper neural network layers than math reasoning. The study analyzed attention dynamics, layer depth requirements, expert activation in MoE models, and the impact of different input designs on table understanding performance.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

GradCFA: A Hybrid Gradient-Based Counterfactual and Feature Attribution Explanation Algorithm for Local Interpretation of Neural Networks

Researchers introduce GradCFA, a new hybrid AI explanation framework that combines counterfactual explanations and feature attribution to improve transparency in neural network decisions. The algorithm extends beyond binary classification to multi-class scenarios and demonstrates superior performance in generating feasible, plausible, and diverse explanations compared to existing methods.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

CATFormer: When Continual Learning Meets Spiking Transformers With Dynamic Thresholds

Researchers introduce CATFormer, a new spiking neural network architecture that solves catastrophic forgetting in continual learning through dynamic threshold neurons. The framework uses context-adaptive thresholds and task-agnostic inference to maintain knowledge across multiple learning tasks without performance degradation.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Thinking in Latents: Adaptive Anchor Refinement for Implicit Reasoning in LLMs

Researchers introduce AdaAnchor, a new AI reasoning framework that performs silent computation in latent space rather than generating verbose step-by-step reasoning. The system adaptively determines when to stop refining its internal reasoning process, achieving up to 5% better accuracy while reducing token generation by 92-93% and cutting refinement steps by 48-60%.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Two Birds, One Projection: Harmonizing Safety and Utility in LVLMs via Inference-time Feature Projection

Researchers propose 'Two Birds, One Projection,' a new inference-time defense method for Large Vision-Language Models that simultaneously improves both safety and utility performance. The method addresses modality-induced bias by projecting cross-modal features onto the null space of identified bias directions, breaking the traditional safety-utility tradeoff.
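
The null-space projection mentioned in this summary can be illustrated with a small sketch. It is not the paper's implementation: the function names, the toy feature vector, and the single bias direction are all assumptions made for illustration. It only shows the linear-algebra step of removing the component of a feature that lies along a set of given directions.

```python
from math import sqrt

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormalize(directions, eps=1e-12):
    """Gram-Schmidt: turn the given directions into an orthonormal basis."""
    basis = []
    for d in directions:
        r = list(d)
        for b in basis:
            c = dot(r, b)
            r = [ri - c * bi for ri, bi in zip(r, b)]
        norm = sqrt(dot(r, r))
        if norm > eps:  # skip directions already spanned by the basis
            basis.append([ri / norm for ri in r])
    return basis

def project_to_null_space(x, directions):
    """Remove from x every component lying in the span of `directions`."""
    out = list(x)
    for b in orthonormalize(directions):
        c = dot(out, b)
        out = [oi - c * bi for oi, bi in zip(out, b)]
    return out

# Hypothetical 3-d feature and one hypothetical bias direction.
feature = [3.0, 4.0, 5.0]
bias_dirs = [[1.0, 0.0, 0.0]]
projected = project_to_null_space(feature, bias_dirs)
```

After the projection, `projected` has no component along the bias direction, while the remaining coordinates are left unchanged. How the bias directions are identified is not described in the summary and is outside this sketch.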

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

AnoleVLA: Lightweight Vision-Language-Action Model with Deep State Space Models for Mobile Manipulation

Researchers have developed AnoleVLA, a lightweight Vision-Language-Action model for robotic manipulation that uses deep state space models instead of traditional transformers. The model achieved 21 points higher task success rate than large-scale VLAs while running three times faster, making it suitable for resource-constrained robotic applications.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

RAZOR: Ratio-Aware Layer Editing for Targeted Unlearning in Vision Transformers and Diffusion Models

Researchers introduce RAZOR, a new framework for efficiently removing sensitive information from AI models like CLIP and Stable Diffusion without requiring full retraining. The method selectively edits specific layers and attention heads in transformer models to achieve targeted 'unlearning' while preserving overall performance.

🧠 Stable Diffusion

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

SimCert: Probabilistic Certification for Behavioral Similarity in Deep Neural Network Compression

Researchers developed SimCert, a probabilistic certification framework that verifies behavioral similarity between compressed neural networks and their original versions. The framework addresses critical safety challenges in deploying compressed DNNs on resource-constrained systems by providing quantitative safety guarantees with adjustable confidence levels.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Universe Routing: Why Self-Evolving Agents Need Epistemic Control

Researchers propose a 'universe routing' solution for AI agents that struggle to choose appropriate reasoning frameworks when faced with different types of questions. The study shows that hard routing to specialized solvers is 7x faster than soft mixing approaches, with a 465M-parameter router achieving superior generalization and zero forgetting in continual learning scenarios.

🏢 Meta

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Nudging Hidden States: Training-Free Model Steering for Chain-of-Thought Reasoning in Large Audio-Language Models

Researchers developed training-free model steering techniques to improve reasoning in large audio-language models (LALMs) through chain-of-thought prompting. The approach achieved up to 4.4% accuracy gains and demonstrated cross-modal transfer where text-derived steering vectors can effectively guide speech-based reasoning.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Compute Allocation for Reasoning-Intensive Retrieval Agents

Researchers studied computational resource allocation in AI retrieval systems for long-horizon agents, finding that re-ranking stages benefit more from powerful models and deeper candidate pools than query expansion stages. The study suggests concentrating compute power on re-ranking rather than distributing it uniformly across pipeline stages for better performance.

🧠 Gemini

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

PA³: Policy-Aware Agent Alignment through Chain-of-Thought

Researchers developed PA³, a new method to improve AI assistant alignment with business policies by teaching models to recall and apply relevant rules during reasoning without including full policies in prompts. The approach reduces computational overhead by 40% while achieving 16-point performance improvements over baselines.


AI · Bearish · arXiv – CS AI · Mar 17 · 6/10

The Scenic Route to Deception: Dark Patterns and Explainability Pitfalls in Conversational Navigation

Researchers warn that AI-powered conversational navigation systems using Large Language Models could transform route guidance from verifiable geometric tasks into manipulative dialogues. The study proposes a framework categorizing risks as dark patterns or explainability pitfalls, suggesting neuro-symbolic architectures to maintain trustworthiness.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

MALicious INTent Dataset and Inoculating LLMs for Enhanced Disinformation Detection

Researchers released MALINT, the first human-annotated English dataset for detecting disinformation and its malicious intent, developed with expert fact-checkers. The study benchmarked 12 language models and introduced intent-based inoculation techniques that improved zero-shot disinformation detection across six datasets, five LLMs, and seven languages.

🧠 Llama

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Infinite Problem Generator: Verifiably Scaling Physics Reasoning Data with Agentic Workflows

Researchers introduce the Infinite Problem Generator (IPG), an AI framework that creates verifiable physics problems using executable Python code instead of probabilistic text generation. The authors release ClassicalMechanicsV1, a dataset of 1,335 physics problems, and use it to show that code complexity can serve as a precise measure of problem difficulty when training large language models.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Distilling Reasoning Without Knowledge: A Framework for Reliable LLMs

Researchers propose a new framework for large language models that separates planning from factual retrieval to improve reliability in fact-seeking question answering. The modular approach uses a lightweight student planner trained via teacher-student learning to generate structured reasoning steps, showing improved accuracy and speed on challenging benchmarks.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

VLA-Thinker: Boosting Vision-Language-Action Models through Thinking-with-Image Reasoning

Researchers introduce VLA-Thinker, a new AI framework that enhances Vision-Language-Action models by enabling dynamic visual reasoning during robotic tasks. The system achieved a 97.5% success rate on LIBERO benchmarks through a two-stage training pipeline combining supervised fine-tuning and reinforcement learning.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

ES-Merging: Biological MLLM Merging via Embedding Space Signals

Researchers propose ES-Merging, a new framework for combining specialized biological multimodal large language models (MLLMs) by using embedding space signals rather than traditional parameter-based methods. The approach estimates merging coefficients at both layer-wise and element-wise granularities, outperforming existing merging techniques and even task-specific fine-tuned models on cross-modal scientific problems.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

OxyGen: Unified KV Cache Management for Vision-Language-Action Models under Multi-Task Parallelism

Researchers propose OxyGen, a unified KV cache management system for Vision-Language-Action Models that enables efficient multi-task parallelism in embodied AI agents. The system achieves up to 3.7x speedup by sharing computational resources across tasks and eliminating redundant processing of shared observations.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

AerialVLA: A Vision-Language-Action Model for UAV Navigation via Minimalist End-to-End Control

Researchers propose AerialVLA, a minimalist end-to-end Vision-Language-Action framework for UAV navigation that directly maps visual observations and linguistic instructions to continuous control signals. The system eliminates reliance on external object detectors and dense oracle guidance, achieving nearly three times the success rate of existing baselines in unseen environments.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Deconfounded Lifelong Learning for Autonomous Driving via Dynamic Knowledge Spaces

Researchers propose DeLL, a new framework for autonomous driving systems that addresses lifelong learning challenges through dynamic knowledge spaces and causal inference mechanisms. The system uses Dirichlet process mixture models to prevent catastrophic forgetting and improve adaptability to new driving scenarios while maintaining previously learned knowledge.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

AEX: Non-Intrusive Multi-Hop Attestation and Provenance for LLM APIs

Researchers propose AEX, a new attestation protocol for LLM APIs that provides cryptographic proof that API responses actually correspond to client requests. The system addresses trust issues with hosted AI models by adding signed attestation objects to existing JSON-based APIs without disrupting current functionality.

🏢 OpenAI
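
The idea of attaching a signed object that binds a response to its request can be sketched as follows. This is not the AEX protocol: the field names, the canonicalization, and the use of a shared-key HMAC (standing in for whatever signature scheme the paper actually specifies) are assumptions chosen so the example runs with the Python standard library alone.

```python
import hashlib
import hmac
import json

def canonical_hash(obj):
    """SHA-256 over a deterministic JSON encoding of obj."""
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()

def attest(request, response, key):
    """Build a hypothetical attestation object binding response to request."""
    payload = {
        "request_sha256": canonical_hash(request),
        "response_sha256": canonical_hash(response),
    }
    message = json.dumps(payload, sort_keys=True).encode("utf-8")
    # HMAC is a stand-in here; a real deployment would likely use
    # an asymmetric signature so clients cannot forge attestations.
    payload["mac"] = hmac.new(key, message, hashlib.sha256).hexdigest()
    return payload

def verify(request, response, attestation, key):
    """Recompute the attestation and compare MACs in constant time."""
    expected = attest(request, response, key)
    return hmac.compare_digest(expected["mac"], attestation.get("mac", ""))

# Hypothetical request/response pair and demo key.
key = b"demo-key"
request = {"model": "example-model", "prompt": "hello"}
response = {"text": "hi there"}
attestation = attest(request, response, key)

ok = verify(request, response, attestation, key)
tampered = verify({"model": "example-model", "prompt": "bye"}, response, attestation, key)
```

Here `ok` is true and `tampered` is false, because the MAC covers the hashes of both the request and the response. The attestation travels as an extra JSON object next to the ordinary response body, which matches the summary's point about not disrupting existing APIs.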

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Mitigating Overthinking in Large Reasoning Language Models via Reasoning Path Deviation Monitoring

Researchers propose a new early-exit method for Large Reasoning Language Models that detects and prevents overthinking by monitoring high-entropy transition tokens that indicate deviation from correct reasoning paths. The method improves performance and efficiency compared to existing approaches without requiring additional training overhead or limiting inference throughput.
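
A rough sketch of entropy-based monitoring, under stated assumptions: the summary does not give the paper's actual exit criterion, so the threshold rule, the function names, and the toy distributions below are hypothetical. The sketch only shows how the Shannon entropy of a next-token distribution can be computed and used to flag a high-uncertainty step.

```python
from math import log

def token_entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * log(p) for p in probs if p > 0.0)

def first_high_entropy_step(step_distributions, threshold):
    """Return the index of the first step whose entropy exceeds threshold."""
    for i, probs in enumerate(step_distributions):
        if token_entropy(probs) > threshold:
            return i
    return None  # no step crossed the threshold

# Toy distributions over a 4-token vocabulary (assumed values).
steps = [
    [0.97, 0.01, 0.01, 0.01],  # confident
    [0.90, 0.05, 0.03, 0.02],  # still fairly confident
    [0.25, 0.25, 0.25, 0.25],  # maximally uncertain
]
exit_at = first_high_entropy_step(steps, threshold=1.0)
```

With these toy values only the uniform distribution exceeds the threshold, so `exit_at` is the index of the last step. A monitor of this kind needs no extra training, since it reads only the probabilities the model already produces.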
