AI Pulse News

Models, papers, tools. 19,527 articles with AI-powered sentiment analysis and key takeaways.

19527 articles

AINeutralarXiv – CS AI · Mar 176/10

🧠

QuarkMedBench: A Real-World Scenario Driven Benchmark for Evaluating Large Language Models

Researchers introduced QuarkMedBench, a new benchmark for evaluating large language models on real-world medical queries using over 20,000 queries across clinical care scenarios. The benchmark addresses limitations of current medical AI evaluations that rely on multiple-choice questions by using an automated scoring framework that achieves 91.8% concordance with clinical expert assessments.

AIBullisharXiv – CS AI · Mar 176/10

🧠

REFINE-DP: Diffusion Policy Fine-tuning for Humanoid Loco-manipulation via Reinforcement Learning

Researchers developed REFINE-DP, a hierarchical framework that combines diffusion policies with reinforcement learning to enable humanoid robots to perform complex loco-manipulation tasks. The system achieves over 90% success rate in simulation and demonstrates smooth autonomous execution in real-world environments for tasks like door traversal and object transport.

AIBullisharXiv – CS AI · Mar 176/10

🧠

Knowledge Distillation for Large Language Models

Researchers developed a resource-efficient framework for compressing large language models using knowledge distillation and chain-of-thought reinforcement learning. The method successfully compressed Qwen 3B to 0.5B while retaining 70-95% of performance across English, Spanish, and coding tasks, making AI models more suitable for resource-constrained deployments.

AIBullisharXiv – CS AI · Mar 176/10

🧠

Retrieval-Feedback-Driven Distillation and Preference Alignment for Efficient LLM-based Query Expansion

Researchers developed a framework to make large language model-based query expansion more efficient by distilling knowledge from powerful teacher models into compact student models. The approach uses retrieval feedback and preference alignment to maintain 97% of the original performance while dramatically reducing inference costs.

AIBullisharXiv – CS AI · Mar 176/10

🧠

IGU-LoRA: Adaptive Rank Allocation via Integrated Gradients and Uncertainty-Aware Scoring

Researchers introduce IGU-LoRA, a new parameter-efficient fine-tuning method for large language models that adaptively allocates ranks across layers using integrated gradients and uncertainty-aware scoring. The approach addresses limitations of existing methods like AdaLoRA by providing more stable and accurate layer importance estimates, consistently outperforming baselines across diverse tasks.

AIBullisharXiv – CS AI · Mar 176/10

🧠

Computation and Communication Efficient Federated Unlearning via On-server Gradient Conflict Mitigation and Expression

Researchers propose FOUL (Federated On-server Unlearning), a new framework for efficiently removing specific participants' data from federated learning models without accessing client data. The approach reduces computational and communication costs while maintaining privacy compliance through a two-stage process that performs unlearning operations on the server side.

AIBullisharXiv – CS AI · Mar 176/10

🧠

Collapse or Preserve: Data-Dependent Temporal Aggregation for Spiking Neural Network Acceleration

Researchers developed Temporal Aggregated Convolution (TAC) to accelerate spiking neural networks by aggregating spike frames before convolution, achieving 13.8x speedup on rate-coded data. The study reveals that optimal temporal aggregation strategies depend on data type - collapsing temporal dimensions for rate-coded data while preserving them for event-based data.

🏢 Nvidia

AINeutralarXiv – CS AI · Mar 176/10

🧠

Is Seeing Believing? Evaluating Human Sensitivity to Synthetic Video

Research reveals that humans can detect credibility issues in deepfake videos through visual and audio distortions. Three experiments show that both technical artifacts and distortions in synthetic media reduce perceived credibility, though understanding of human perception of deepfakes remains limited.

AIBullisharXiv – CS AI · Mar 176/10

🧠

UVLM: A Universal Vision-Language Model Loader for Reproducible Multimodal Benchmarking

Researchers have introduced UVLM (Universal Vision-Language Model Loader), a Google Colab-based framework that provides a unified interface for loading, configuring, and benchmarking multiple Vision-Language Model architectures. The framework currently supports LLaVA-NeXT and Qwen2.5-VL models and enables researchers to compare different VLMs using identical evaluation protocols on custom image analysis tasks.

AIBullisharXiv – CS AI · Mar 176/10

🧠

Pixel-level Scene Understanding in One Token: Visual States Need What-is-Where Composition

Researchers propose CroBo, a new visual state representation learning framework that helps robotic agents better understand dynamic environments by encoding both semantic identities and spatial locations of scene elements. The framework uses a global-to-local reconstruction method that compresses observations into compact tokens, achieving state-of-the-art performance on robot policy learning benchmarks.

AIBullisharXiv – CS AI · Mar 176/10

🧠

SmoothVLA: Aligning Vision-Language-Action Models with Physical Constraints via Intrinsic Smoothness Optimization

Researchers introduce SmoothVLA, a new reinforcement learning framework that improves robot control by optimizing both task performance and motion smoothness. The system addresses the trade-off between stability and exploration in Vision-Language-Action models, achieving 13.8% better smoothness than standard RL methods.

AIBullisharXiv – CS AI · Mar 176/10

🧠

LLM-Guided Reinforcement Learning for Audio-Visual Speech Enhancement

Researchers have developed a new audio-visual speech enhancement framework that uses Large Language Models and reinforcement learning to improve speech quality. The method outperforms existing baselines by using LLM-generated natural language feedback as rewards for model training, providing more interpretable optimization compared to traditional scalar metrics.

AIBullisharXiv – CS AI · Mar 176/10

🧠

Not All Latent Spaces Are Flat: Hyperbolic Concept Control

Researchers introduced HyCon, a hyperbolic control mechanism for text-to-image models that provides better safety controls by steering generation away from unsafe content. The technique uses hyperbolic representation spaces instead of traditional Euclidean adjustments, achieving state-of-the-art results across multiple safety benchmarks.

AINeutralarXiv – CS AI · Mar 176/10

🧠

Concisely Explaining the Doubt: Minimum-Size Abductive Explanations for Linear Models with a Reject Option

Researchers developed a method to compute minimum-size abductive explanations for AI linear models with reject options, addressing a key challenge in explainable AI for critical domains. The approach uses log-linear algorithms for accepted instances and integer linear programming for rejected instances, proving more efficient than existing methods despite theoretical NP-hardness.

AIBullisharXiv – CS AI · Mar 176/10

🧠

Diffusion Reinforcement Learning via Centered Reward Distillation

Researchers present Centered Reward Distillation (CRD), a new reinforcement learning framework for fine-tuning diffusion models that addresses brittleness issues in existing methods. The approach uses within-prompt centering and drift control techniques to achieve state-of-the-art performance in text-to-image generation while reducing reward hacking and convergence issues.

AINeutralarXiv – CS AI · Mar 176/10

🧠

Citation-Enforced RAG for Fiscal Document Intelligence: Cited, Explainable Knowledge Retrieval in Tax Compliance

Researchers have developed a new AI framework that uses citation-enforced retrieval-augmented generation (RAG) specifically for analyzing tax and fiscal documents. The system prioritizes transparency and explainability for tax authorities, showing improved citation accuracy and reduced AI hallucinations when tested on real IRS documents.

AINeutralarXiv – CS AI · Mar 176/10

🧠

Deeper Thought, Weaker Aim: Understanding and Mitigating Perceptual Impairment during Reasoning in Multimodal Large Language Models

Researchers have identified that multimodal large language models (MLLMs) lose visual focus during complex reasoning tasks, with attention becoming scattered across images rather than staying on relevant regions. They propose a training-free Visual Region-Guided Attention (VRGA) framework that improves visual grounding and reasoning accuracy by reweighting attention to question-relevant areas.

AIBullisharXiv – CS AI · Mar 176/10

🧠

Self-Indexing KVCache: Predicting Sparse Attention from Compressed Keys

Researchers propose a novel self-indexing KV cache system that unifies compression and retrieval for efficient sparse attention in large language models. The method uses 1-bit vector quantization and integrates with FlashAttention to reduce memory bottlenecks in long-context LLM inference.

AIBearisharXiv – CS AI · Mar 176/10

🧠

I'm Not Reading All of That: Understanding Software Engineers' Level of Cognitive Engagement with Agentic Coding Assistants

A research study reveals that software engineers' cognitive engagement consistently declines when working with agentic AI coding assistants, raising concerns about over-reliance and reduced critical thinking. The study found that current AI assistants provide limited support for reflection and verification, identifying design opportunities to promote deeper thinking in AI-assisted programming.

AIBullisharXiv – CS AI · Mar 176/10

🧠

Mitigating Overthinking in Large Reasoning Language Models via Reasoning Path Deviation Monitoring

Researchers propose a new early-exit method for Large Reasoning Language Models that detects and prevents overthinking by monitoring high-entropy transition tokens that indicate deviation from correct reasoning paths. The method improves performance and efficiency compared to existing approaches without requiring additional training overhead or limiting inference throughput.

AINeutralarXiv – CS AI · Mar 176/10

🧠

AEX: Non-Intrusive Multi-Hop Attestation and Provenance for LLM APIs

Researchers propose AEX, a new attestation protocol for LLM APIs that provides cryptographic proof that API responses actually correspond to client requests. The system addresses trust issues with hosted AI models by adding signed attestation objects to existing JSON-based APIs without disrupting current functionality.

🏢 OpenAI

AIBullisharXiv – CS AI · Mar 176/10

🧠

Deconfounded Lifelong Learning for Autonomous Driving via Dynamic Knowledge Spaces

Researchers propose DeLL, a new framework for autonomous driving systems that addresses lifelong learning challenges through dynamic knowledge spaces and causal inference mechanisms. The system uses Dirichlet process mixture models to prevent catastrophic forgetting and improve adaptability to new driving scenarios while maintaining previously learned knowledge.

AIBullisharXiv – CS AI · Mar 176/10

🧠

AerialVLA: A Vision-Language-Action Model for UAV Navigation via Minimalist End-to-End Control

Researchers propose AerialVLA, a minimalist end-to-end Vision-Language-Action framework for UAV navigation that directly maps visual observations and linguistic instructions to continuous control signals. The system eliminates reliance on external object detectors and dense oracle guidance, achieving nearly three times the success rate of existing baselines in unseen environments.

AIBullisharXiv – CS AI · Mar 176/10

🧠

OxyGen: Unified KV Cache Management for Vision-Language-Action Models under Multi-Task Parallelism

Researchers propose OxyGen, a unified KV cache management system for Vision-Language-Action Models that enables efficient multi-task parallelism in embodied AI agents. The system achieves up to 3.7x speedup by sharing computational resources across tasks and eliminating redundant processing of shared observations.

AIBullisharXiv – CS AI · Mar 176/10

🧠

From $\boldsymbol{\log\pi}$ to $\boldsymbol{\pi}$: Taming Divergence in Soft Clipping via Bilateral Decoupled Decay of Probability Gradient Weight

Researchers introduce Decoupled Gradient Policy Optimization (DGPO), a new reinforcement learning method that improves large language model training by using probability gradients instead of log-probability gradients. The technique addresses instability issues in current methods while maintaining exploration capabilities, showing superior performance across mathematical benchmarks.

← PrevPage 377 of 782Next →