AI Pulse News

Models, papers, tools. 31,118 articles with AI-powered sentiment analysis and key takeaways.


AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

From Graph Retrieval to Schema Realization: Counterfactual Validation for Text-to-SPARQL over Heterogeneous Knowledge Graphs

SchemaForge, a new AI framework, improves text-to-SPARQL query generation over heterogeneous knowledge graphs by using schema-grounded validation. The system improves accuracy by 11.5 percentage points over existing baselines across four benchmarks, a practical advance in translating natural language into database queries.

AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

Graph is a Natural Regularization: Revisiting Vector Quantization for Graph Representation Learning

Researchers propose RGVQ, a novel framework addressing codebook collapse in Vector Quantization for graph neural networks, a technical limitation that degrades token expressiveness and generalization. By integrating graph topology as regularization and introducing soft assignments, RGVQ improves codebook utilization across downstream graph learning tasks.

AI · Bullish · arXiv – CS AI · Jun 26/10

🧠

TuneAgent: Agentic Operating System Kernel Tuning with Reinforcement Learning

Researchers introduce TuneAgent, an AI-powered framework using reinforcement learning and large language models to automatically optimize Linux kernel configurations. The system achieves up to 5.6% performance improvements while maintaining configuration validity, addressing a longstanding challenge in OS optimization that traditionally requires manual expert tuning.

AI · Neutral · arXiv – CS AI · Jun 25/10

🧠

Deep Learning as the Disciplined Construction of Tame Objects

A mathematical research paper proposes that deep learning models can be understood through tame geometry (o-minimality), a mathematical framework that enables convergence guarantees for stochastic gradient descent in nonsmooth, nonconvex settings. This perspective offers a formal mathematical foundation for analyzing AI system behavior and training stability.

AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

End-to-End Deep Learning for Predicting Metric Space-Valued Outputs

Researchers introduce E2M (End-to-End Metric regression), a deep learning framework that predicts non-Euclidean outputs like probability distributions and networks by computing weighted Fréchet means with neural network-learned weights. The method preserves geometric properties of output spaces while achieving state-of-the-art performance across multiple domains without requiring surrogate embeddings.
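
To make the idea of a weighted Fréchet mean concrete, here is a minimal sketch for one special case: one-dimensional distributions under the 2-Wasserstein metric, where the Fréchet mean has a closed form (its quantile function is the weighted average of the input quantile functions). The function name, the sample-based representation, and the hand-picked weights are illustrative assumptions; in E2M the weights would be produced by a neural network, and this is not the paper's code.

```python
def weighted_frechet_mean_1d(samples, weights):
    """Weighted 2-Wasserstein Frechet mean of 1-D empirical distributions.

    Each distribution is a list with the same number of samples. Sorting a
    list gives its empirical quantile function on a shared grid, and the
    barycenter is the weighted average of those quantile functions.
    """
    if len(samples) != len(weights):
        raise ValueError("need exactly one weight per distribution")
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all distributions must have the same sample count")
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")

    normalized = [w / total for w in weights]   # weights sum to 1
    quantiles = [sorted(s) for s in samples]    # empirical quantile functions
    return [
        sum(w * q[i] for w, q in zip(normalized, quantiles))
        for i in range(n)
    ]


# Hypothetical weights standing in for the output of a learned weight network.
mean = weighted_frechet_mean_1d(
    [[0.0, 1.0, 2.0], [12.0, 10.0, 11.0]],
    [0.75, 0.25],
)
print(mean)  # [2.5, 3.5, 4.5]
```

The result is itself a valid distribution in the output space, which is the sense in which this kind of averaging preserves geometry instead of relying on a surrogate embedding.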

AI · Bullish · arXiv – CS AI · Jun 26/10

🧠

T-POP: Test-Time Personalization with Online Preference Feedback

Researchers introduce T-POP, a novel algorithm that personalizes large language models in real-time by learning from user preference feedback during text generation, without requiring parameter updates or extensive pre-existing user data. The method combines test-time alignment with dueling bandits to efficiently balance exploration and exploitation, addressing the cold-start problem in LLM personalization.

AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

Make a Video Call with LLM: A Measurement Campaign over Six Mainstream Apps

Researchers conducted the first systematic performance benchmark of AI video chat systems across six mainstream applications, measuring quality, latency, internal mechanisms, and system overhead. The study reveals that network latency impacts AI video calls less significantly than human video calls, while AI agent capabilities emerge as the primary driver of user experience.

AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

Simultaneous Multi-objective Alignment Across Verifiable and Non-verifiable Rewards

Researchers propose MAHALO, a framework for training large language models across multiple competing objectives simultaneously, including verifiable tasks like math reasoning and non-verifiable subjective preferences like human values alignment. The approach uses PRM-guided decoding and Multi-Action-Head DPO to balance conflicting goals while maintaining user control during inference.

AI · Neutral · arXiv – CS AI · Jun 25/10

🧠

HRTFformer: A Spatially-Aware Transformer for Individual HRTF Upsampling in Immersive Audio Rendering

Researchers introduce HRTFformer, a transformer-based neural network that improves the spatial upsampling of Head-Related Transfer Functions (HRTFs) used in immersive audio applications. By leveraging attention mechanisms and spherical harmonic domain processing, the model reconstructs high-fidelity spatial audio from sparse measurements with improved accuracy and realistic spatial coherence.

AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

Margin Adaptive DPO: Leveraging Reward Model for Granular Control in Preference Optimization

Researchers introduce Margin-Adaptive Direct Preference Optimization (MADPO), a novel method that improves large language model alignment by using a reward model to apply instance-level adaptive weights to training samples. MADPO addresses limitations in existing approaches like DPO and β-DPO by providing stable, granular control over the learning signal without discarding training data.

AI · Bullish · arXiv – CS AI · Jun 26/10

🧠

Domain-Shift-Aware Conformal Prediction for Large Language Models

Researchers propose Domain-Shift-Aware Conformal Prediction (DS-CP), a framework that improves reliability of large language model outputs by adapting conformal prediction methods to handle domain shift. The approach reweights calibration samples based on proximity to test prompts, delivering more reliable uncertainty quantification and reducing hallucinations in real-world deployments.
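
The reweighting idea can be sketched with generic weighted conformal prediction: calibration scores near the test prompt count more when the acceptance threshold is computed. This is a sketch under stated assumptions, not the DS-CP algorithm itself. The exponential kernel, the `bandwidth` parameter, and both function names are hypothetical choices made for illustration.

```python
import math


def proximity_weights(distances, bandwidth=1.0):
    """Turn distances to the test prompt into weights (assumed kernel)."""
    return [math.exp(-d / bandwidth) for d in distances]


def weighted_conformal_threshold(scores, weights, test_weight, alpha):
    """Weighted (1 - alpha) quantile of calibration nonconformity scores.

    The test point contributes its own weight at +infinity, so the
    threshold is infinite when the calibration mass is too small to
    reach the requested coverage level.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    total = sum(weights) + test_weight
    target = (1 - alpha) * total
    cumulative = 0.0
    for score, weight in sorted(zip(scores, weights)):
        cumulative += weight
        if cumulative >= target:
            return score
    return math.inf


# Three equally weighted calibration scores plus the test point.
threshold = weighted_conformal_threshold(
    scores=[0.3, 0.1, 0.2], weights=[1.0, 1.0, 1.0], test_weight=1.0, alpha=0.5
)
print(threshold)  # 0.2
```

A candidate answer whose nonconformity score is at most `threshold` would be kept in the prediction set; raising the weight of nearby calibration prompts shifts that threshold toward what was observed on similar inputs.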

AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

Catch-Only-One: Non-Transferable Examples for Model-Specific Authorization

Researchers introduce non-transferable examples (NTEs), a novel data encoding technique that restricts unauthorized model access while preserving utility for authorized applications. The method leverages model-specific low-sensitivity subspaces to act as cryptographic-like controls on AI data usage, addressing regulatory demands for purpose limitation without requiring model retraining or deployment control.

AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

Learning-To-Measure: In-Context Active Feature Acquisition

Researchers introduce Learning-to-Measure (L2M), a meta-learning framework that enables AI systems to learn optimal feature acquisition strategies across multiple tasks without task-specific retraining. The approach combines uncertainty quantification with a greedy acquisition agent, demonstrating superior performance on tabular datasets with missing features and limited labels.

AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

The Geometry of Grokking: Norm Minimization on the Zero-Loss Manifold

Researchers provide a mathematical framework explaining grokking—the phenomenon where neural networks suddenly generalize after memorizing training data. The study proves that gradient descent minimizes weight norms on the zero-loss manifold and derives closed-form expressions for post-memorization dynamics, offering theoretical clarity on this previously elusive learning behavior.

AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

Optimizing Diversity and Quality through Base-Aligned Model Collaboration

Researchers propose Base-Aligned Model Collaboration (BACo), an inference-time framework that dynamically combines base and aligned language models to improve both output diversity and quality simultaneously. The method uses token-level routing strategies based on uncertainty signals, achieving a 21.3% joint improvement in diversity-quality metrics without requiring expensive retraining or multi-pass decoding.

AI · Neutral · arXiv – CS AI · Jun 25/10

🧠

NILC: Discovering New Intents with LLM-assisted Clustering

Researchers introduce NILC, a novel clustering framework that combines large language models with iterative refinement to improve new intent discovery in dialogue systems. Unlike traditional cascaded approaches relying solely on embedding-based K-Means clustering, NILC leverages LLMs to enhance cluster semantics and augment ambiguous utterances, demonstrating consistent performance gains across multiple benchmark datasets.

AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

RoboBenchMart: Benchmarking Robots in Retail Environment

Researchers introduced RoboBenchMart, an open-source simulated benchmark for evaluating robotic systems in retail dark-store environments. The study reveals that current state-of-the-art vision-language-action (VLA) models struggle with complex grocery manipulation tasks, indicating limitations in their generalization across diverse domains beyond tabletop scenarios.

AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

Evaluating the Performance of Deep Learning Models in Whole-body Dynamic 3D Posture Prediction During Load-reaching Activities

Researchers developed deep learning models using BLSTM and transformer architectures to predict full-body human posture during dynamic load-reaching tasks. A novel cost function enforcing constant body segment lengths improved prediction accuracy by 8-21%, with transformer models achieving 58% better long-term performance than LSTM alternatives.
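
A cost term that enforces constant segment lengths can be pictured as a penalty on how far each predicted limb length drifts from a fixed reference. The sketch below shows one plausible form, a mean squared deviation. The joint names, the data layout, and the function name are hypothetical, and the paper's exact cost function may differ.

```python
import math


def segment_length_penalty(frames, segments, reference_lengths):
    """Mean squared deviation of predicted segment lengths from references.

    frames: list of poses, each a dict mapping joint name -> (x, y, z)
    segments: list of (joint_a, joint_b) pairs defining body segments
    reference_lengths: one fixed length per segment, same order as segments
    """
    total, count = 0.0, 0
    for pose in frames:
        for (joint_a, joint_b), reference in zip(segments, reference_lengths):
            length = math.dist(pose[joint_a], pose[joint_b])
            total += (length - reference) ** 2
            count += 1
    return total / count if count else 0.0


# Two predicted frames: the forearm is 3 units long in the first frame
# and stretches to 5 units in the second (reference length: 3).
frames = [
    {"elbow": (0.0, 0.0, 0.0), "wrist": (0.0, 0.0, 3.0)},
    {"elbow": (0.0, 0.0, 0.0), "wrist": (3.0, 4.0, 0.0)},
]
penalty = segment_length_penalty(frames, [("elbow", "wrist")], [3.0])
print(penalty)  # 2.0
```

In training, such a term would typically be added to the ordinary joint-position error with a tunable weight, so that anatomically impossible stretching is discouraged without overriding the main prediction objective.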

AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

Understanding the Effects of Distractors on Reasoning Vision-Language Models

Researchers investigate how irrelevant visual information affects reasoning in vision-language models, finding that visual distractors reduce accuracy without lengthening reasoning traces—contrasting with textual distractors in language models. The study introduces a new dataset and proposes a prompting strategy to mitigate distractor-driven errors in multimodal AI systems.

AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

SpeedAug: Policy Acceleration via Tempo-Enriched Policy and RL Fine-Tuning

SpeedAug is a new reinforcement learning framework that accelerates robotic policy execution by learning optimal task speeds rather than relying on conservative demonstration data. The method combines tempo-enriched policy learning with RL fine-tuning to achieve 1.8x faster real-world task throughput while maintaining success rates.

AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

From Segments to Scenes: Temporal Understanding in Autonomous Driving via Vision-Language Model

Researchers introduce the Temporal Understanding in Autonomous Driving (TAD) benchmark, a dataset of nearly 6,000 QA pairs designed to evaluate vision-language models' ability to understand temporal sequences in driving scenarios. The study reveals that state-of-the-art VLMs significantly underperform on temporal reasoning tasks and proposes two training-free solutions—Scene-CoT and TCogMap—that improve accuracy by up to 17.72% on the benchmark.

🏢 Hugging Face

AI · Bullish · arXiv – CS AI · Jun 26/10

🧠

ShelfAware: Real-Time Semantic Localization in Quasi-Static Environments with Low-Cost Sensors

ShelfAware is a semantic particle filter system that enables robust indoor localization in dynamic, cluttered environments using low-cost vision sensors. By treating scene semantics as statistical evidence rather than fixed landmarks, the technology achieves 97% global localization success in retail settings and outperforms existing geometric and semantic baselines.

AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

VocSim: A Training-free Benchmark for Zero-shot Content Identity in Single-source Audio

Researchers introduce VocSim, a training-free benchmark that evaluates how well audio embeddings identify content across diverse sound sources without parameter updates or labeled data. Across 125k clips spanning speech, animal vocalizations, and environmental sounds, frozen Whisper embeddings perform well overall, but significant generalization gaps remain for low-resource and non-English languages.

AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

InFerActive: Interactive Tree-Based Exploration of LLM Sampling for Safety Evaluation

InFerActive is an interactive system that improves how AI safety evaluators assess large language models by visualizing sampling results as navigable trees rather than static spreadsheets. The tool uses breadth-first sampling to achieve equivalent harmful-response coverage with up to 5x fewer samples, significantly improving evaluation efficiency according to controlled user studies.

AI · Neutral · arXiv – CS AI · Jun 26/10

🧠

Calibrating Uncertainty for Zero-Shot Adversarial CLIP

Researchers propose an adversarial fine-tuning method for CLIP that addresses a critical gap in zero-shot classification: while perturbations degrade accuracy, they also suppress uncertainty estimates, causing overconfidence. The approach reparameterizes CLIP outputs as Dirichlet distribution parameters to jointly optimize for robustness and calibrated uncertainty, achieving competitive results across benchmarks.
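
To show what treating classifier outputs as Dirichlet parameters buys, here is a minimal sketch in the style of evidential deep learning: logits become non-negative evidence, evidence plus one gives the concentration parameters, and low total evidence reads as high uncertainty. The softplus mapping and all function names are assumptions made for illustration; the summary does not state the paper's exact parameterization.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def dirichlet_from_logits(logits):
    """Map logits to Dirichlet concentrations: alpha = evidence + 1 (assumed)."""
    return [softplus(z) + 1.0 for z in logits]


def expected_probs(alpha):
    """Mean of the Dirichlet distribution, used as class probabilities."""
    strength = sum(alpha)
    return [a / strength for a in alpha]


def vacuity(alpha):
    """Uncertainty mass K / sum(alpha): 1.0 with no evidence, near 0 with a lot."""
    return len(alpha) / sum(alpha)


no_evidence = [1.0, 1.0, 1.0]
print(vacuity(no_evidence))               # 1.0
print(expected_probs([2.0, 1.0, 1.0]))    # [0.5, 0.25, 0.25]

confident = dirichlet_from_logits([8.0, -4.0, -4.0])
print(vacuity(confident) < vacuity(no_evidence))  # True
```

Under this view, an adversarial perturbation that lowers accuracy should also lower total evidence, so the uncertainty estimate rises instead of being suppressed.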
