🧠

AI

21,049 AI articles curated from 50+ sources with AI-powered sentiment analysis, importance scoring, and key takeaways.

AI · Bearish · arXiv – CS AI · Mar 17 · 6/10

I'm Not Reading All of That: Understanding Software Engineers' Level of Cognitive Engagement with Agentic Coding Assistants

A research study reveals that software engineers' cognitive engagement consistently declines when working with agentic AI coding assistants, raising concerns about over-reliance and reduced critical thinking. The study found that current AI assistants provide limited support for reflection and verification, identifying design opportunities to promote deeper thinking in AI-assisted programming.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Self-Indexing KVCache: Predicting Sparse Attention from Compressed Keys

Researchers propose a novel self-indexing KV cache system that unifies compression and retrieval for efficient sparse attention in large language models. The method uses 1-bit vector quantization and integrates with FlashAttention to reduce memory bottlenecks in long-context LLM inference.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Mitigating Overthinking in Large Reasoning Language Models via Reasoning Path Deviation Monitoring

Researchers propose a new early-exit method for Large Reasoning Language Models that detects and prevents overthinking by monitoring high-entropy transition tokens that indicate deviation from correct reasoning paths. The method improves performance and efficiency compared to existing approaches without requiring additional training overhead or limiting inference throughput.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Deeper Thought, Weaker Aim: Understanding and Mitigating Perceptual Impairment during Reasoning in Multimodal Large Language Models

Researchers have identified that multimodal large language models (MLLMs) lose visual focus during complex reasoning tasks, with attention becoming scattered across images rather than staying on relevant regions. They propose a training-free Visual Region-Guided Attention (VRGA) framework that improves visual grounding and reasoning accuracy by reweighting attention to question-relevant areas.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

PA³: Policy-Aware Agent Alignment through Chain-of-Thought

Researchers developed PA³, a new method to improve AI assistant alignment with business policies by teaching models to recall and apply relevant rules during reasoning without including full policies in prompts. The approach reduces computational overhead by 40% while achieving 16-point performance improvements over baselines.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

AEX: Non-Intrusive Multi-Hop Attestation and Provenance for LLM APIs

Researchers propose AEX, a new attestation protocol for LLM APIs that provides cryptographic proof that API responses actually correspond to client requests. The system addresses trust issues with hosted AI models by adding signed attestation objects to existing JSON-based APIs without disrupting current functionality.

🏢 OpenAI

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Citation-Enforced RAG for Fiscal Document Intelligence: Cited, Explainable Knowledge Retrieval in Tax Compliance

Researchers have developed a new AI framework that uses citation-enforced retrieval-augmented generation (RAG) specifically for analyzing tax and fiscal documents. The system prioritizes transparency and explainability for tax authorities, showing improved citation accuracy and reduced AI hallucinations when tested on real IRS documents.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Diffusion Reinforcement Learning via Centered Reward Distillation

Researchers present Centered Reward Distillation (CRD), a new reinforcement learning framework for fine-tuning diffusion models that addresses brittleness issues in existing methods. The approach uses within-prompt centering and drift control techniques to achieve state-of-the-art performance in text-to-image generation while reducing reward hacking and convergence issues.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Concisely Explaining the Doubt: Minimum-Size Abductive Explanations for Linear Models with a Reject Option

Researchers developed a method to compute minimum-size abductive explanations for AI linear models with reject options, addressing a key challenge in explainable AI for critical domains. The approach uses log-linear algorithms for accepted instances and integer linear programming for rejected instances, proving more efficient than existing methods despite theoretical NP-hardness.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Not All Latent Spaces Are Flat: Hyperbolic Concept Control

Researchers introduced HyCon, a hyperbolic control mechanism for text-to-image models that provides better safety controls by steering generation away from unsafe content. The technique uses hyperbolic representation spaces instead of traditional Euclidean adjustments, achieving state-of-the-art results across multiple safety benchmarks.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

UVLM: A Universal Vision-Language Model Loader for Reproducible Multimodal Benchmarking

Researchers have introduced UVLM (Universal Vision-Language Model Loader), a Google Colab-based framework that provides a unified interface for loading, configuring, and benchmarking multiple Vision-Language Model architectures. The framework currently supports LLaVA-NeXT and Qwen2.5-VL models and enables researchers to compare different VLMs using identical evaluation protocols on custom image analysis tasks.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Is Seeing Believing? Evaluating Human Sensitivity to Synthetic Video

Three experiments show that visual and audio distortions in synthetic video lower its perceived credibility, indicating that viewers are sensitive to technical artifacts in deepfakes. The study notes that human perception of deepfakes remains poorly understood.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Collapse or Preserve: Data-Dependent Temporal Aggregation for Spiking Neural Network Acceleration

Researchers developed Temporal Aggregated Convolution (TAC) to accelerate spiking neural networks by aggregating spike frames before convolution, achieving a 13.8x speedup on rate-coded data. The study finds that the optimal temporal aggregation strategy depends on the data type: collapsing the temporal dimension works for rate-coded data, while event-based data requires preserving it.

🏢 Nvidia

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

IGU-LoRA: Adaptive Rank Allocation via Integrated Gradients and Uncertainty-Aware Scoring

Researchers introduce IGU-LoRA, a new parameter-efficient fine-tuning method for large language models that adaptively allocates ranks across layers using integrated gradients and uncertainty-aware scoring. The approach addresses limitations of existing methods like AdaLoRA by providing more stable and accurate layer importance estimates, consistently outperforming baselines across diverse tasks.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Retrieval-Feedback-Driven Distillation and Preference Alignment for Efficient LLM-based Query Expansion

Researchers developed a framework to make large language model-based query expansion more efficient by distilling knowledge from powerful teacher models into compact student models. The approach uses retrieval feedback and preference alignment to maintain 97% of the original performance while dramatically reducing inference costs.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Pixel-level Scene Understanding in One Token: Visual States Need What-is-Where Composition

Researchers propose CroBo, a new visual state representation learning framework that helps robotic agents better understand dynamic environments by encoding both semantic identities and spatial locations of scene elements. The framework uses a global-to-local reconstruction method that compresses observations into compact tokens, achieving state-of-the-art performance on robot policy learning benchmarks.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Computation and Communication Efficient Federated Unlearning via On-server Gradient Conflict Mitigation and Expression

Researchers propose FOUL (Federated On-server Unlearning), a new framework for efficiently removing specific participants' data from federated learning models without accessing client data. The approach reduces computational and communication costs while maintaining privacy compliance through a two-stage process that performs unlearning operations on the server side.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Knowledge Distillation for Large Language Models

Researchers developed a resource-efficient framework for compressing large language models using knowledge distillation and chain-of-thought reinforcement learning. The method successfully compressed Qwen 3B to 0.5B while retaining 70-95% of performance across English, Spanish, and coding tasks, making AI models more suitable for resource-constrained deployments.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Locatability-Guided Adaptive Reasoning for Image Geo-Localization with Vision-Language Models

Researchers introduce Geo-ADAPT, a new AI framework using Vision-Language Models for image geo-localization that adapts reasoning depth based on image complexity. The system uses an Optimized Locatability Score and specialized dataset to achieve state-of-the-art performance while reducing AI hallucinations.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

QuarkMedBench: A Real-World Scenario Driven Benchmark for Evaluating Large Language Models

Researchers introduced QuarkMedBench, a new benchmark for evaluating large language models on real-world medical queries using over 20,000 queries across clinical care scenarios. The benchmark addresses limitations of current medical AI evaluations that rely on multiple-choice questions by using an automated scoring framework that achieves 91.8% concordance with clinical expert assessments.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

NCCL EP: Towards a Unified Expert Parallel Communication API for NCCL

Researchers have developed NCCL EP, a new communication library for Mixture-of-Experts (MoE) AI model architectures that improves GPU-initiated communication performance. The library provides unified APIs supporting both low-latency inference and high-throughput training modes, built entirely on NVIDIA's NCCL Device API.

🏢 Nvidia

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

MR-GNF: Multi-Resolution Graph Neural Forecasting on Ellipsoidal Meshes for Efficient Regional Weather Prediction

Researchers developed MR-GNF, a lightweight AI model that performs regional weather forecasting using multi-resolution graph neural networks on ellipsoidal meshes. The model achieves accuracy competitive with traditional numerical weather prediction systems while using significantly fewer computational resources (under 80 GPU-hours on a single RTX 6000 Ada).

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Resolving Interference (RI): Disentangling Models for Improved Model Merging

Researchers have developed Resolving Interference (RI), a new framework that improves AI model merging by reducing cross-task interference when combining specialized models. The method makes models functionally orthogonal to other tasks using only unlabeled data, improving merging performance by up to 3.8% and generalization by up to 2.3%.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

SmoothVLA: Aligning Vision-Language-Action Models with Physical Constraints via Intrinsic Smoothness Optimization

Researchers introduce SmoothVLA, a new reinforcement learning framework that improves robot control by optimizing both task performance and motion smoothness. The system addresses the trade-off between stability and exploration in Vision-Language-Action models, achieving 13.8% better smoothness than standard RL methods.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

REFINE-DP: Diffusion Policy Fine-tuning for Humanoid Loco-manipulation via Reinforcement Learning

Researchers developed REFINE-DP, a hierarchical framework that combines diffusion policies with reinforcement learning to enable humanoid robots to perform complex loco-manipulation tasks. The system achieves a success rate above 90% in simulation and demonstrates smooth autonomous execution in real-world environments for tasks such as door traversal and object transport.
