y0news

#arxiv News & Analysis

408 articles tagged with #arxiv. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 26 · 6/10

A Deep Dive into Scaling RL for Code Generation with Synthetic Data and Curricula

Researchers developed a scalable multi-turn pipeline that generates synthetic data for improving large language models' code generation through reinforcement learning. The approach uses teacher models to create structured difficulty progressions for curriculum-based training, and it shows consistent improvements in code generation across Llama3.1-8B and Qwen models.

🧠 Llama
AI · Neutral · arXiv – CS AI · Mar 26 · 6/10

Retrieval Improvements Do Not Guarantee Better Answers: A Study of RAG for AI Policy QA

A research study on retrieval-augmented generation (RAG) systems for AI policy analysis found that improving retrieval quality doesn't necessarily lead to better question-answering performance. The research used 947 AI policy documents and discovered that stronger retrieval can paradoxically cause more confident hallucinations when relevant information is missing.

AI · Bullish · arXiv – CS AI · Mar 26 · 6/10

Beyond Multi-Token Prediction: Pretraining LLMs with Future Summaries

Researchers propose Future Summary Prediction (FSP), a new pretraining method for large language models that predicts compact representations of long-term future text sequences. FSP outperforms traditional next-token prediction and multi-token prediction methods in math, reasoning, and coding benchmarks when tested on 3B and 8B parameter models.

AI · Neutral · arXiv – CS AI · Mar 26 · 6/10

From Sycophancy to Sensemaking: Premise Governance for Human-AI Decision Making

Researchers propose a new framework for human-AI decision making that shifts from AI systems providing fluent but potentially sycophantic answers to collaborative premise governance. The approach uses discrepancy-driven control loops to detect conflicts and ensure commitment to decision-critical premises before taking action.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

The AI Fiction Paradox

A new research paper identifies the 'AI-Fiction Paradox': AI models desperately need fiction as training data but struggle to generate quality fiction themselves. The paper outlines three core challenges: narrative causation requiring temporal paradoxes, informational revaluation that conflicts with current attention mechanisms, and multi-scale emotional architecture that current AI cannot orchestrate effectively.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models

A comprehensive research study examines the relationship between Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) methods for improving Large Language Models after pre-training. The research identifies emerging trends toward hybrid post-training approaches that combine both methods, analyzing applications from 2023-2025 to establish when each method is most effective.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Why Do LLM-based Web Agents Fail? A Hierarchical Planning Perspective

Researchers propose a hierarchical planning framework to analyze why LLM-based web agents fail at complex navigation tasks. The study reveals that while structured PDDL plans outperform natural language plans, low-level execution and perceptual grounding remain the primary bottlenecks rather than high-level reasoning.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Agentic Retoucher for Text-To-Image Generation

Researchers introduce Agentic Retoucher, a new AI framework that fixes common distortions in text-to-image generation through a three-agent system for perception, reasoning, and correction. The system outperformed existing methods on a new 27K-image dataset, potentially improving the quality and reliability of AI-generated images.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Argumentation for Explainable and Globally Contestable Decision Support with LLMs

Researchers introduce ArgEval, a new framework that enhances Large Language Model decision-making through structured argumentation and global contestability. Unlike previous approaches limited to binary choices and local corrections, ArgEval maps entire decision spaces and builds reusable argumentation frameworks that can be globally modified to prevent repeated mistakes.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Gradient Atoms: Unsupervised Discovery, Attribution and Steering of Model Behaviors via Sparse Decomposition of Training Gradients

Researchers introduce Gradient Atoms, an unsupervised method that decomposes AI model training gradients to discover interpretable behaviors without requiring predefined queries. The technique can identify model behaviors like refusal patterns and arithmetic capabilities, while also serving as effective steering vectors to control model outputs.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

OpenHospital: A Thing-in-itself Arena for Evolving and Benchmarking LLM-based Collective Intelligence

Researchers introduce OpenHospital, a new interactive arena designed to develop and benchmark Large Language Model-based Collective Intelligence through physician-patient agent interactions. The platform uses a data-in-agent-self paradigm to rapidly enhance AI agent capabilities while providing evaluation metrics for medical proficiency and system efficiency.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Advancing Multimodal Agent Reasoning with Long-Term Neuro-Symbolic Memory

Researchers introduced NS-Mem, a neuro-symbolic memory framework that combines neural representations with symbolic structures to improve multimodal AI agent reasoning. The system achieved a 4.35% average improvement in reasoning accuracy over pure neural systems, with gains of up to 12.5% on constrained reasoning tasks.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Learning from Partial Chain-of-Thought via Truncated-Reasoning Self-Distillation

Researchers introduce Truncated-Reasoning Self-Distillation (TRSD), a post-training method that enables AI language models to maintain accuracy while using shorter reasoning traces. The technique reduces computational costs by training models to produce correct answers from partial reasoning, achieving significant inference-time efficiency gains without sacrificing performance.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Feature-level Interaction Explanations in Multimodal Transformers

Researchers introduce FL-I2MoE, a new Mixture-of-Experts layer for multimodal Transformers that explicitly identifies synergistic and redundant cross-modal feature interactions. The method provides more interpretable explanations for how different data modalities contribute to AI decision-making compared to existing approaches.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Resolving Interference (RI): Disentangling Models for Improved Model Merging

Researchers have developed Resolving Interference (RI), a new framework that improves AI model merging by reducing cross-task interference when combining specialized models. The method makes models functionally orthogonal to other tasks using only unlabeled data, improving merging performance by up to 3.8% and generalization by up to 2.3%.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

REFINE-DP: Diffusion Policy Fine-tuning for Humanoid Loco-manipulation via Reinforcement Learning

Researchers developed REFINE-DP, a hierarchical framework that combines diffusion policies with reinforcement learning to enable humanoid robots to perform complex loco-manipulation tasks. The system achieves a success rate above 90% in simulation and demonstrates smooth autonomous execution in real-world environments for tasks like door traversal and object transport.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Not All Latent Spaces Are Flat: Hyperbolic Concept Control

Researchers introduced HyCon, a hyperbolic control mechanism for text-to-image models that provides better safety controls by steering generation away from unsafe content. The technique uses hyperbolic representation spaces instead of traditional Euclidean adjustments, achieving state-of-the-art results across multiple safety benchmarks.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Deeper Thought, Weaker Aim: Understanding and Mitigating Perceptual Impairment during Reasoning in Multimodal Large Language Models

Researchers have identified that multimodal large language models (MLLMs) lose visual focus during complex reasoning tasks, with attention becoming scattered across images rather than staying on relevant regions. They propose a training-free Visual Region-Guided Attention (VRGA) framework that improves visual grounding and reasoning accuracy by reweighting attention to question-relevant areas.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

VLA-Thinker: Boosting Vision-Language-Action Models through Thinking-with-Image Reasoning

Researchers introduce VLA-Thinker, a new AI framework that enhances Vision-Language-Action models by enabling dynamic visual reasoning during robotic tasks. The system achieved a 97.5% success rate on LIBERO benchmarks through a two-stage training pipeline combining supervised fine-tuning and reinforcement learning.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

MVHOI: Bridge Multi-view Condition to Complex Human-Object Interaction Video Reenactment via 3D Foundation Model

Researchers introduce MVHOI, a new AI framework that significantly improves human-object interaction video generation by handling complex 3D manipulations through a two-stage process using 3D foundation models. The system can create realistic long-duration videos showing intricate object manipulations from multiple viewpoints, addressing limitations of existing approaches that struggle with non-planar movements.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

CATFormer: When Continual Learning Meets Spiking Transformers With Dynamic Thresholds

Researchers introduce CATFormer, a new spiking neural network architecture that solves catastrophic forgetting in continual learning through dynamic threshold neurons. The framework uses context-adaptive thresholds and task-agnostic inference to maintain knowledge across multiple learning tasks without performance degradation.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

Conceptual Views of Neural Networks: A Framework for Neuro-Symbolic Analysis

Researchers introduce 'conceptual views', a formal framework based on Formal Concept Analysis for globally explaining neural networks. Tests on 24 ImageNet models and on the Fruits-360 dataset show that the framework can faithfully represent models, enable architecture comparison, and extract human-comprehensible rules from neurons.

AI · Bullish · arXiv – CS AI · Mar 17 · 6/10

Slow-Fast Policy Optimization: Reposition-Before-Update for LLM Reasoning

Researchers introduce Slow-Fast Policy Optimization (SFPO), a new reinforcement learning framework that improves training stability and efficiency for large language model reasoning. SFPO outperforms existing methods like GRPO by up to 2.80 points on math benchmarks while requiring up to 4.93x fewer rollouts and 4.19x less training time.
