y0news

#arxiv News & Analysis

408 articles tagged with #arxiv. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 2
🧠

NExT-Guard: Training-Free Streaming Safeguard without Token-Level Labels

Researchers introduce NExT-Guard, a training-free framework for real-time AI safety monitoring that uses Sparse Autoencoders to detect unsafe content in streaming language models. The system outperforms traditional supervised training methods while requiring no token-level annotations, making it more cost-effective and scalable for deployment.

AI · Neutral · arXiv – CS AI · Mar 4 · 7/10 · 3
🧠

Forecasting as Rendering: A 2D Gaussian Splatting Framework for Time Series Forecasting

Researchers introduce TimeGS, a novel time series forecasting framework that reimagines prediction as 2D generative rendering using Gaussian splatting techniques. The approach addresses key limitations in existing methods by treating future sequences as continuous latent surfaces and enforcing temporal continuity across periodic boundaries.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 3
🧠

Mitigating Over-Refusal in Aligned Large Language Models via Inference-Time Activation Energy

Researchers introduce Energy Landscape Steering (ELS), a new framework that reduces false refusals in AI safety-aligned language models without compromising security. The method uses an external Energy-Based Model to dynamically guide model behavior during inference, improving compliance from 57.3% to 82.6% on safety benchmarks.

AI · Bearish · arXiv – CS AI · Mar 4 · 6/10 · 3
🧠

Contextual Drag: How Errors in the Context Affect LLM Reasoning

Researchers have identified 'contextual drag', a phenomenon in which large language models (LLMs) repeat similar errors when failed attempts are present in their context. The study found 10-20% performance drops across 11 models on 8 reasoning tasks, with iterative self-refinement potentially leading to self-deterioration.

AI · Bullish · arXiv – CS AI · Mar 4 · 7/10 · 4
🧠

PRISM: Pushing the Frontier of Deep Think via Process Reward Model-Guided Inference

Researchers introduce PRISM, a new AI inference algorithm that uses Process Reward Models to guide deep reasoning systems. The method significantly improves performance on mathematical and scientific benchmarks by treating candidate solutions as particles in an energy landscape and using score-guided refinement to concentrate on higher-quality reasoning paths.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 5
🧠

Arbor: A Framework for Reliable Navigation of Critical Conversation Flows

Researchers introduce Arbor, a framework that decomposes large language model decision-making into specialized node-level tasks for critical applications like healthcare triage. The system improves accuracy by 29.4 percentage points while reducing latency by 57.1% and costs by 14.4x compared to single-prompt approaches.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3
🧠

RLP: Reinforcement as a Pretraining Objective

Researchers introduce RLP (Reinforcement Learning Pretraining), a new training method that incorporates reinforcement learning exploration into the pretraining phase rather than only post-training. The approach treats chain-of-thought reasoning as exploratory actions and achieved 19% performance improvements on math and science benchmarks across different model architectures.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 5
🧠

Vid-LLM: A Compact Video-based 3D Multimodal LLM with Reconstruction-Reasoning Synergy

Researchers propose Vid-LLM, a new video-based 3D multimodal large language model that processes video inputs without requiring external 3D data for scene understanding. The model uses a Cross-Task Adapter module and Metric Depth Model to integrate geometric cues and maintain consistency across 3D tasks like question answering and visual grounding.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3
🧠

Dream2Learn: Structured Generative Dreaming for Continual Learning

Researchers introduce Dream2Learn (D2L), a continual learning framework that enables AI models to generate synthetic training data from their own internal representations, mimicking human dreaming for knowledge consolidation. The system creates novel 'dreamed classes' using diffusion models to improve forward knowledge transfer and prevent catastrophic forgetting in neural networks.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 5
🧠

Elo-Evolve: A Co-evolutionary Framework for Language Model Alignment

Researchers introduce Elo-Evolve, a new framework for training AI language models using dynamic multi-agent competition instead of static reward functions. The method achieves 4.5x noise reduction and demonstrates superior performance compared to traditional alignment approaches when tested on Qwen2.5-7B models.
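The summary does not say how Elo-Evolve computes its ratings. As background only, the sketch below shows the standard Elo update that the name alludes to; the function names and the K-factor of 32 are illustrative assumptions, not details taken from the paper.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one pairwise comparison.

    score_a is 1.0 if A wins, 0.5 for a draw, 0.0 if A loses.
    k (assumed value) controls how far a single result moves the ratings.
    """
    exp_a = expected_score(rating_a, rating_b)
    exp_b = 1.0 - exp_a
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - exp_b)
    return new_a, new_b


if __name__ == "__main__":
    # Two equally rated models; A wins the comparison.
    print(elo_update(1500.0, 1500.0, 1.0))
```

Because the two adjustments are equal and opposite, the total rating in the pool stays constant, which is what lets ratings act as a relative rather than absolute signal.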

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠

BridgeDrive: Diffusion Bridge Policy for Closed-Loop Trajectory Planning in Autonomous Driving

BridgeDrive introduces a novel diffusion bridge policy for autonomous driving trajectory planning that transforms coarse anchor trajectories into refined plans while maintaining theoretical consistency. The system achieves state-of-the-art performance on the Bench2Drive benchmark with a 7.72% improvement in success rate and is compatible with real-time deployment.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠

AgentOCR: Reimagining Agent History via Optical Self-Compression

Researchers introduce AgentOCR, a framework that converts AI agent interaction histories from text to compressed visual format, reducing token usage by over 50% while maintaining 95% performance. The system uses visual caching and adaptive compression to address memory bottlenecks in large language model deployments.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3
🧠

ExGRPO: Learning to Reason from Experience

Researchers introduce ExGRPO, a new framework that improves AI reasoning by reusing and prioritizing valuable training experiences based on correctness and entropy. The method shows consistent performance gains of +3.5-7.6 points over standard approaches across multiple model sizes while providing more stable training.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠

Reasoning or Retrieval? A Study of Answer Attribution on Large Reasoning Models

Researchers discovered that large reasoning models (LRMs) suffer from inconsistent answers due to competing mechanisms between Chain-of-Thought reasoning and memory retrieval. They developed FARL, a new fine-tuning framework that suppresses retrieval shortcuts to promote genuine reasoning capabilities in AI models.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠

Overcoming Joint Intractability with Lossless Hierarchical Speculative Decoding

Researchers have developed Hierarchical Speculative Decoding (HSD), a new method that significantly improves AI inference speed while maintaining accuracy by solving joint intractability problems in verification processes. The technique shows over 12% performance gains when integrated with existing frameworks like EAGLE-3, establishing new state-of-the-art efficiency standards.

AI · Neutral · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠

How Do LLMs Use Their Depth?

New research reveals that large language models use a "Guess-then-Refine" framework, starting with high-frequency token predictions in early layers and refining them with contextual information in deeper layers. The study provides detailed insights into layer-wise computation dynamics through multiple-choice tasks, fact recall analysis, and part-of-speech predictions.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3
🧠

Value Flows

Researchers have developed Value Flows, a new reinforcement learning method that uses flow-based models to estimate complete return distributions rather than single scalar values. The approach achieves 1.3x improvement in success rates across 62 benchmark tasks by better identifying states with high return uncertainty for improved decision-making.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠

SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs

Researchers introduce SwiReasoning, a training-free framework that improves large language model reasoning by dynamically switching between explicit chain-of-thought and latent reasoning modes. The method achieves 1.8%-3.1% accuracy improvements and 57%-79% better token efficiency across mathematics, STEM, coding, and general benchmarks.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 2
🧠

Reasoning on Time-Series for Financial Technical Analysis

Researchers introduce Verbal Technical Analysis (VTA), a framework that combines Large Language Models with time-series analysis to produce interpretable stock forecasts. The system converts stock price data into textual annotations and uses natural language reasoning to achieve state-of-the-art forecasting accuracy across U.S., Chinese, and European markets.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠

Self-Harmony: Learning to Harmonize Self-Supervision and Self-Play in Test-Time Reinforcement Learning

Researchers introduce Self-Harmony, a new test-time reinforcement learning framework that improves AI model accuracy by having models solve problems and rephrase questions simultaneously. The method uses harmonic mean aggregation instead of majority voting to select stable answers, achieving state-of-the-art results across 28 of 30 reasoning benchmarks without requiring human supervision.
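To make the contrast with majority voting concrete, here is a minimal sketch. It assumes (this is not stated in the summary) that each candidate answer is scored by the harmonic mean of its frequency under the original question and its frequency under the rephrased question; all function names are hypothetical.

```python
from collections import Counter


def answer_frequencies(answers):
    """Map each distinct answer to its share of the sampled answers."""
    counts = Counter(answers)
    total = len(answers)
    return {answer: count / total for answer, count in counts.items()}


def harmonic_mean(p, q):
    """Harmonic mean of two frequencies; zero if either one is zero."""
    return 0.0 if p + q == 0 else 2 * p * q / (p + q)


def select_by_harmonic_mean(original_answers, rephrased_answers):
    """Pick the answer that is frequent under BOTH views of the question."""
    p = answer_frequencies(original_answers)
    q = answer_frequencies(rephrased_answers)
    candidates = set(p) | set(q)
    scores = {a: harmonic_mean(p.get(a, 0.0), q.get(a, 0.0)) for a in candidates}
    # sorted() makes tie-breaking deterministic.
    best = max(sorted(scores), key=scores.get)
    return best, scores


def select_by_majority(original_answers, rephrased_answers):
    """Baseline: pool all samples and take the most common answer."""
    return Counter(original_answers + rephrased_answers).most_common(1)[0][0]


if __name__ == "__main__":
    original = ["A"] * 7 + ["B"] * 3
    rephrased = ["A"] * 2 + ["B"] * 5 + ["C"] * 3
    print(select_by_majority(original, rephrased))
    print(select_by_harmonic_mean(original, rephrased)[0])
```

In this toy data, answer A dominates the pooled vote but is rare under the rephrased question, so the harmonic mean favors B, which is reasonably frequent under both views. The harmonic mean penalizes answers that are strong in only one view, which is the intuition behind selecting "stable" answers.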

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠

Learning from Synthetic Data Improves Multi-hop Reasoning

Researchers demonstrated that large language models can improve multi-hop reasoning performance by training on rule-generated synthetic data instead of expensive human annotations or frontier LLM outputs. The study found that LLMs trained on synthetic fictional data performed better on real-world question-answering benchmarks by learning fundamental knowledge composition skills.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3
🧠

SageBwd: A Trainable Low-bit Attention

Researchers have developed SageBwd, a trainable INT8 attention mechanism that can match full-precision attention performance during pre-training while quantizing six of seven attention matrix multiplications. The study identifies key factors for stable training including QK-norm requirements and the impact of tokens per step on quantization errors.
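For readers unfamiliar with what quantizing a matrix multiplication to INT8 means, the sketch below shows generic symmetric per-tensor quantization in plain Python. It is an illustration of the general idea only; the per-tensor scaling, the helper names, and the clipping range are assumptions and do not reproduce SageBwd's actual scheme.

```python
def quantize_int8(matrix):
    """Symmetric per-tensor quantization: floats -> ints in [-127, 127]."""
    max_abs = max(abs(x) for row in matrix for x in row)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [
        [max(-127, min(127, round(x / scale))) for x in row] for row in matrix
    ]
    return quantized, scale


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def int8_matmul(a, b):
    """Quantize both operands, multiply in integers, then rescale to floats."""
    qa, scale_a = quantize_int8(a)
    qb, scale_b = quantize_int8(b)
    acc = matmul(qa, qb)  # integer accumulation
    return [[v * scale_a * scale_b for v in row] for row in acc]


if __name__ == "__main__":
    a = [[1.0, -2.0], [0.5, 4.0]]
    b = [[0.25, 1.0], [-1.0, 2.0]]
    print(matmul(a, b))       # full-precision reference
    print(int8_matmul(a, b))  # quantized approximation
```

The quantized product differs from the reference by a small rounding error that grows with the dynamic range of the inputs, which is one reason normalizing the inputs to a quantized product can matter for stability.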

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 4
🧠

Distribution-Aligned Decoding for Efficient LLM Task Adaptation

Researchers introduce SVDecode, a new method for adapting large language models to specific tasks without extensive fine-tuning. The technique uses steering vectors during decoding to align output distributions with task requirements, improving accuracy by up to 5 percentage points while adding minimal computational overhead.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3
🧠

AdaRank: Adaptive Rank Pruning for Enhanced Model Merging

Researchers introduce AdaRank, a new AI model merging framework that adaptively selects optimal singular directions from task vectors to combine multiple fine-tuned models. The technique addresses cross-task interference in existing SVD-based approaches by dynamically pruning problematic components at test time, achieving state-of-the-art performance with a gap of nearly 1% from individual fine-tuned models.