AI Pulse News

Models, papers, tools. 40,082 articles with AI-powered sentiment analysis and key takeaways.


AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

ChronoForest: Closed-Loop Multi-Tree Diffusion Planning for Efficient Bridge Search and Route Composition

ChronoForest introduces a closed-loop planning system that enables efficient long-horizon route planning by composing short offline trajectories, achieving 99.8% success on complex navigation benchmarks. The system addresses a critical challenge in offline navigation where collecting extensive long-range training data is impractical but agents must still solve extended tasks optimally.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

What Matters When Cotraining Robot Manipulation Policies on Everyday Human Videos?

Researchers demonstrate that everyday Internet videos can effectively train robot manipulation policies when combined with high-quality hand pose labels and specialized network architectures. Their approach achieves a 29.7% success rate improvement in low-data robot scenarios across multiple manipulation tasks, suggesting that abundant unstructured video data may supplement expensive curated robotic demonstrations.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

How Language Models Fail: Token-Level Signatures of Committed and Persistent Reasoning Failures

Researchers have identified two distinct failure modes in large language model reasoning: committed failures where models lock onto incorrect paths early, and persistent uncertainty failures where doubt accumulates throughout reasoning. The framework, validated across 23 model-dataset configurations, provides diagnostic signatures for detecting reasoning failures and offers practical implications for improving self-consistency methods.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

CAF-Gen: A Multi-Agent System for Enriching Argumentation Structures

CAF-Gen is a new multi-agent AI system that automatically enriches basic argument structures into complex, formally structured argumentation models using the Carneades Argumentation Framework. The iterative Creator-Reviewer pipeline improves reasoning formalization in computational linguistics by validating outputs through collaborative feedback loops rather than single-pass generation.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

HKJudge: A Legal Discourse-Annotated Corpus for Interpreting What Courts Find, How They Reason, and What They Rule

Researchers introduce HKJudge, the first expert-annotated corpus of Hong Kong court judgments with ~290k sentences across all five court levels. The dataset enables analysis of judicial reasoning through 26 rhetorical roles and legal element extraction, establishing benchmarks for AI models in legal judgment prediction.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

ShallowBench: Benchmarking Generative Drug Design Models on Shallow-Pocket Targets

Researchers introduce ShallowBench, a curated benchmark of 5,780 shallow-pocket protein targets, revealing that current generative AI drug design models struggle with low-concavity binding sites common in challenging oncology targets like KRAS and MYC. The benchmark highlights a critical gap in generative biology that requires new architectural innovations to address historically undruggable targets.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

MSAIC-Net: A Multi-Scale Attention and Imbalance-Aware Contrastive Network for ECG-Based Myocardial Substrate Abnormality Detection

Researchers present MSAIC-Net, a deep learning framework that improves ECG-based detection of myocardial substrate abnormalities like scarring and heart attacks. The model combines multi-scale attention mechanisms with contrastive learning to address class imbalance and interpretability challenges, demonstrating strong performance on both institutional and public datasets.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

SCOUT: Semantic scene COverage via Uncertainty-guided Traversal

SCOUT is an online semantic exploration framework that enables robots to actively understand indoor environments by coupling real-time scene graph construction with uncertainty-guided traversal planning. The system builds 3D scene graphs with probabilistic object labels and structural relations, then uses uncertainty metrics to decide where robots should explore next, treating semantic scene completion as an operational objective rather than a passive mapping byproduct.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

Multilingual Multi-Speaker Unit Vocoders: A Systematic Analysis of Discrete Speech Representations

Researchers analyze how discrete speech units derived from self-supervised learning entangle phonetic, speaker, and language information in multilingual vocoder systems. The study demonstrates that cluster size directly controls intelligibility while explicit speaker conditioning prevents identity collapse, with implications for improving Audio LLMs and speech generation systems.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

HybridCodec: Fast Dual-Stream, Semantically Enhanced Neural Audio Codec

HybridCodec presents a novel neural audio codec architecture that combines semantic and acoustic feature streams while distilling SSL representations, achieving 3x speedup over existing dual-stream models. The advancement addresses the growing demand for efficient audio tokenizers in multimodal large language models by improving semantic specialization and cross-lingual robustness.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

Evidence Graph Consistency in Retrieval-Augmented Generation: A Model-Dependent Analysis of Hallucination Detection

Researchers propose Evidence Graph Consistency (EGC), a framework to detect hallucinations in Retrieval-Augmented Generation systems by analyzing structural relationships among evidence pieces. Testing across six LLMs reveals a critical finding: the method works as expected for Llama-2 but shows reversed diagnostic signals for GPT-4, GPT-3.5, and Mistral-7B, suggesting hallucination patterns differ fundamentally across model families.

🧠 GPT-4 · 🧠 Llama

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

AxisGuide: Grounding Robot Action Coordinate System in RGB Observations for Robust Visuomotor Manipulation

Researchers introduce AxisGuide, a lightweight method that improves robot manipulation by explicitly visualizing action coordinates in camera views. The technique augments visual observations with cues showing robot base-frame axes, enabling better generalization when objects are placed in unseen locations despite identical scene layouts.

AI · Bullish · arXiv – CS AI · Jun 8 · 6/10

🧠

Mind the Gap: Bridging Behavioral Silos with LLMs in Multi-Vertical Recommendations

Researchers propose a framework that uses Large Language Models and Retrieval-Augmented Generation to address the cold-start problem in multi-vertical e-commerce platforms. It transfers behavioral knowledge from data-rich verticals such as restaurants to emerging categories such as grocery and retail. The approach synthesizes hierarchical taxonomic features from user order histories and integrates them into a Multi-Task Learning ranking model, demonstrating improved personalization in production environments.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

Exploring Reinforcement Learning for Fluid Transitions Between Clinical Mental Healthcare and Everyday Wellness Support

Researchers deployed a reinforcement learning-based contextual bandit system to dynamically deliver mental healthcare and wellness interventions as a unified care journey. In a four-week study (N=38), RL-optimized intervention sequences showed delayed benefits post-intervention, and users who engaged more with RL-generated prompts sustained motivation better than those on fixed interventions. The findings raise questions about pacing and intensity in blended clinical-wellness digital health systems.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

Lane Change Trajectory Planning for Personalized Driving Comfort and Mobility Efficiency

Researchers propose a neural network-based lane-change trajectory planner that uses dual-head architecture to balance safety guarantees with personalized driving preferences. The system adaptively switches between a baseline safe mode and a driver-specific comfort/efficiency mode based on contextual driving conditions, enabling autonomous vehicles to optimize maneuvers while maintaining feasibility across diverse scenarios.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

Breaking the Lock-in: Diversifying Text-to-Image Generation via Representation Modulation

Researchers present DAVE, a training-free method that enhances diversity in text-to-image generation by attenuating the DC (zero-frequency) component of intermediate Transformer features during early generation stages. The technique addresses the problem of identical outputs from the same prompt without requiring expensive sampling overhead or auxiliary optimization.
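
As a rough illustration of what attenuating a DC component can mean, the sketch below scales down the per-channel mean of a feature map taken over tokens. That mean is, up to a constant factor, the zero-frequency term of a discrete Fourier transform along the token axis. The function name `attenuate_dc`, the `alpha` parameter, and the choice of the token axis are assumptions made for illustration; the summary does not specify DAVE's actual axis, schedule, or strength.

```python
def attenuate_dc(features, alpha):
    """Shrink the DC (token-mean) component of each channel.

    features: list of token vectors, shape [num_tokens][num_channels].
    alpha: hypothetical attenuation strength in [0, 1]; 0 leaves the
           features unchanged, 1 removes the per-channel mean entirely.
    """
    num_tokens = len(features)
    num_channels = len(features[0])
    # Per-channel mean over tokens, i.e. the zero-frequency component
    # along the token axis (up to a factor of num_tokens).
    means = [
        sum(token[c] for token in features) / num_tokens
        for c in range(num_channels)
    ]
    # Subtract a fraction of the mean; all non-DC variation is untouched.
    return [
        [token[c] - alpha * means[c] for c in range(num_channels)]
        for token in features
    ]


if __name__ == "__main__":
    feats = [[1.0, 2.0], [3.0, 6.0]]
    print(attenuate_dc(feats, 0.5))
```

Only the shared offset across tokens changes here; differences between tokens are preserved, which is one plausible reading of why such an operation could loosen a common layout without extra sampling.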

AI · Bullish · arXiv – CS AI · Jun 8 · 6/10

🧠

SCALE: Scalable Cross-Attention Learning with Extrapolation for Agentic Workflow Scheduling

Researchers introduce SCALE, a deep reinforcement learning scheduler that enables LLM-based agentic systems to generalize across different cluster sizes without retraining. Using cross-attention architecture and a novel regularization technique, the system achieves 8.9% improvement in response times when scaled from 16 to 48 nodes, addressing a critical infrastructure challenge for distributed AI workloads.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

Progress-SQL: Improving Reinforcement Learning for Text-to-SQL via Progressive Rewards

Researchers introduce Progress-SQL, a reinforcement learning framework that improves large language models' ability to convert natural language queries into SQL code through multi-turn refinement with progressive reward signals. The method uses an Oracle-guided Diagnostic Tree to provide clause-level feedback and demonstrates consistent performance improvements across multiple benchmark datasets.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

Think Like a Pilot: Fine-Grained Long-Horizon UAV Navigation

Researchers introduce FLIGHT, a benchmark for training UAV agents to follow natural language instructions with precise, continuous flight control over long-horizon tasks. The accompanying FLIGHT VLA architecture decouples high-level reasoning from low-frequency control, advancing autonomous drone navigation beyond existing discrete-action systems.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

LLM Agent-Assisted Reverse Engineering with Quantitative Readability Metrics

Researchers present a Quantitative Readability Score (QRS) framework that enables LLM agents to improve the readability of decompiled code while maintaining functional correctness. The approach combines structural similarity validation with three independent readability metrics (Lexical Surprisal, Structural Simplicity, and Idiomatic Quality) to guide code refinement without unintended optimization artifacts.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

Characterize Then Distill: Mechanistic Reasoning in Large Output Spaces

Researchers have characterized how modern reasoning models achieve strong zero-shot performance on multi-label selection tasks by operating in two distinct phases: broad candidate shortlisting followed by fine-grained reasoning. This mechanistic understanding enables a more effective distillation strategy that outperforms standard knowledge transfer approaches.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

MotionEnhancer: Leveraging Video Diffusion for Motion-Enhanced Vision-Language Models

Researchers introduce MotionEnhancer, a novel technique that combines Video Diffusion Models with Vision-Language Models to improve fine-grained motion understanding in video analysis. The parameter-free approach uses attention alignment to extract motion priors without requiring additional training or architectural modifications, achieving consistent improvements on motion-understanding benchmarks.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

Modeling Nonlinear Feature Interactions with Product-Unit Residual Networks

Researchers introduce Product-Unit Residual Networks (PURe), a neural architecture that explicitly models nonlinear feature interactions through multiplicative units combined with residual connections. The approach demonstrates improved interpretability, robustness to noise, and sample efficiency compared to standard MLPs across synthetic and real-world datasets.
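
For readers unfamiliar with the term, a product unit in the classical sense computes a weighted product of its inputs, with each input raised to a learned exponent, rather than a weighted sum. The sketch below pairs one such unit per output with a skip connection. The function names, the restriction to strictly positive inputs, and the way the residual is added are assumptions for illustration and are not taken from the PURe paper.

```python
import math


def product_unit(x, w):
    """Classical product unit: the product of x_i ** w_i over all inputs.

    Inputs are assumed strictly positive so fractional exponents are defined.
    """
    return math.prod(xi ** wi for xi, wi in zip(x, w))


def product_residual_block(x, weights):
    """Hypothetical residual block: out_j = x_j + product_unit(x, weights[j]).

    weights holds one exponent vector per output, so len(weights) == len(x).
    """
    return [xj + product_unit(x, wj) for xj, wj in zip(x, weights)]


if __name__ == "__main__":
    # Exponents [1, 1] model the interaction x0 * x1; [2, 0] models x0 ** 2.
    print(product_residual_block([2.0, 3.0], [[1.0, 1.0], [2.0, 0.0]]))
```

Because the exponents can be read directly as the degree of each feature in an interaction term, a unit of this kind is easier to inspect than a stack of additive layers, which is consistent with the interpretability claim in the summary.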

AI · Neutral · arXiv – CS AI · Jun 8 · 5/10

🧠

EgoPressDiff: Multimodal Video Diffusion for Egocentric UV-Domain Hand-Pressure Estimation

EgoPressDiff presents a conditional video diffusion framework that estimates hand-surface contact pressure from egocentric viewpoints by generating UV-pressure maps from visual input. The method combines pose and mesh vertex features with a novel Distribution-Calibrated Spatial Layer to achieve 34% improvement in accuracy metrics, addressing limitations in AR/VR, robotics, and ergonomic applications.

AI · Neutral · arXiv – CS AI · Jun 8 · 6/10

🧠

Neuro-Symbolic Learning for Long-Horizon Task Planning Under Complex Logical Constraints

Researchers present a neuro-symbolic learning framework that addresses a critical inefficiency in robotic task planning by combining neural networks with symbolic planning under complex logical constraints. The method uses bilevel optimization to learn object-importance scores while solving planning problems in pruned search spaces, reducing planning failures by 80% and planning time by 57% across multiple benchmarks and real-world robotic applications.
