y0news

#offline-learning News & Analysis

13 articles tagged with #offline-learning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · 4d ago · 6/10

EmoDistill: Offline Emotion Skill Distillation for Language Model Agents in Adversarial Negotiation

Researchers introduce EmoDistill, an offline framework that teaches language model agents to use emotion strategically in adversarial negotiations. The system decomposes emotional strategy into emotion selection and emotion expression. Experiments show that emotionally framed language significantly shifts negotiation outcomes, suggesting that emotion functions as a tactical tool rather than stylistic decoration.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

Large Language Models for Sequential Decision-Making: Improving In-Context Learning via Supervised Fine-Tuning

Researchers demonstrate that large language models can be fine-tuned on offline trajectory data to perform sequential decision-making tasks across MDPs, POMDPs, and ambiguous environments. The approach outperforms baseline methods, particularly in complex, partially observed scenarios, and a theoretical analysis shows that the fine-tuned attention mechanisms implicitly estimate optimal Q-functions.

AI · Neutral · arXiv – CS AI · Apr 13 · 6/10

WOMBET: World Model-based Experience Transfer for Robust and Sample-efficient Reinforcement Learning

Researchers introduce WOMBET, a framework that improves reinforcement learning efficiency in robotics by generating synthetic training data from a world model in source tasks and selectively transferring it to target tasks. The approach combines offline-to-online learning with uncertainty-aware planning to reduce data collection costs while maintaining robustness.

AI · Bullish · arXiv – CS AI · Apr 6 · 6/10

OPRIDE: Offline Preference-based Reinforcement Learning via In-Dataset Exploration

Researchers have developed OPRIDE, a new algorithm for offline preference-based reinforcement learning that significantly improves query efficiency. The algorithm addresses two key challenges, inefficient exploration and overoptimization, through a principled exploration strategy and a discount scheduling mechanism.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10 · 15

OM2P: Offline Multi-Agent Mean-Flow Policy

Researchers propose OM2P, a new offline multi-agent reinforcement learning algorithm that achieves efficient one-step action sampling using mean-flow models. The approach delivers up to a 3.8x reduction in GPU memory usage and a 10.8x speed-up in training time compared to existing diffusion- and flow-based models.

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 6

LLM4Cov: Execution-Aware Agentic Learning for High-coverage Testbench Generation

Researchers have developed LLM4Cov, an offline learning framework that enables AI agents to generate high-coverage hardware verification testbenches without expensive online reinforcement learning. A compact 4B-parameter model achieved a 69.2% coverage pass rate and outperformed larger models, indicating that the framework learns efficiently from execution feedback in hardware verification tasks.

AI · Neutral · arXiv – CS AI · Mar 17 · 4/10

Safe Flow Q-Learning: Offline Safe Reinforcement Learning with Reachability-Based Flow Policies

Researchers introduce Safe Flow Q-Learning (SafeFQL), a new offline safe reinforcement learning method that combines Hamilton-Jacobi reachability with flow policies for safety-critical real-time control. The method achieves better safety performance and lower inference latency than existing diffusion-based approaches, making it more suitable for real-time deployment.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 6

Conservative Equilibrium Discovery in Offline Game-Theoretic Multiagent Reinforcement Learning

Researchers developed COffeE-PSRO, a new algorithm that applies offline reinforcement learning to game-theoretic multiagent systems. The approach extends Policy Space Response Oracles by incorporating uncertainty quantification and conservative exploration to find equilibrium strategies from fixed datasets without online interaction.

AI · Neutral · arXiv – CS AI · Mar 2 · 4/10 · 6

Offline-to-Online Multi-Agent Reinforcement Learning with Offline Value Function Memory and Sequential Exploration

Researchers propose OVMSE, a new framework for Offline-to-Online Multi-Agent Reinforcement Learning that addresses key challenges in transitioning from offline training to online fine-tuning. The framework introduces Offline Value Function Memory and Sequential Exploration strategies to improve sample efficiency and performance in multi-agent environments.