14 articles tagged with #sample-efficiency. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.
AI · Bullish · arXiv – CS AI · 2d ago · 7/10
🧠 Researchers introduce Zero-shot Visual World Models (ZWM), a computational framework inspired by how young children learn physical understanding from minimal data. The approach combines sparse prediction, causal inference, and compositional reasoning to achieve data-efficient learning, demonstrating that AI systems can match child development patterns while learning from single-child observational data.
AI · Bullish · arXiv – CS AI · Mar 9 · 7/10
🧠 Researchers introduce COLD-Steer, a training-free framework that enables efficient control of large language model behavior at inference time using just a few examples. The method approximates gradient descent effects without parameter updates, achieving 95% steering effectiveness while using 50 times fewer samples than existing approaches.
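The summary above does not specify COLD-Steer's mechanism, so this is only an illustrative sketch of the general training-free steering pattern it resembles: derive a steering vector from the difference of mean hidden activations over a few contrastive examples, then add it to the model's hidden state at inference time with no parameter updates. All function names here are hypothetical.

```python
import numpy as np

def steering_vector(pos_acts, neg_acts):
    """Difference of mean activations over a few contrastive examples."""
    return np.mean(pos_acts, axis=0) - np.mean(neg_acts, axis=0)

def steer(hidden, vec, alpha=1.0):
    """Add the scaled steering vector to a hidden state; no weights change."""
    return hidden + alpha * vec

# Toy 4-d activations from 3 "desired" and 3 "undesired" examples.
rng = np.random.default_rng(0)
pos = rng.normal(1.0, 0.1, size=(3, 4))
neg = rng.normal(-1.0, 0.1, size=(3, 4))

vec = steering_vector(pos, neg)
h = np.zeros(4)
h_steered = steer(h, vec, alpha=0.5)
```

The sample efficiency comes from the vector being a simple statistic over the few examples, rather than a quantity that must be fit by optimization.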
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠 Researchers have developed a new approach called Model Predictive Adversarial Imitation Learning that combines inverse reinforcement learning with model predictive control to enable AI agents to learn from incomplete human demonstrations. The method shows significant improvements in sample efficiency, generalization, and robustness compared to traditional imitation learning approaches.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠 Researchers have developed Curvature-Aware Policy Optimization (CAPO), a new algorithm that improves training stability and sample efficiency for Large Language Models by up to 30x. The method uses advanced mathematical optimization techniques to identify and filter problematic training samples, requiring intervention on fewer than 8% of tokens.
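CAPO's actual curvature estimate is not described in the summary; as an illustrative stand-in, the filtering step can be sketched with a per-token curvature proxy (here, a gradient-norm score) and a quantile cutoff that caps intervention at the stated 8% of tokens. The proxy and function names below are assumptions, not the paper's method.

```python
import numpy as np

def curvature_mask(scores, max_frac=0.08):
    """Keep tokens whose curvature proxy falls below the (1 - max_frac)
    quantile, so filtering touches at most ~max_frac of tokens."""
    cutoff = np.quantile(scores, 1.0 - max_frac)
    return scores <= cutoff

# Toy per-token curvature proxies (e.g. squared gradient norms).
rng = np.random.default_rng(1)
scores = rng.exponential(1.0, size=1000)

mask = curvature_mask(scores, max_frac=0.08)
frac_filtered = 1.0 - mask.mean()  # fraction of tokens intervened on
```

A quantile threshold (rather than a fixed absolute cutoff) keeps the intervention rate stable as the score distribution shifts during training.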
AI · Neutral · arXiv – CS AI · 3d ago · 6/10
🧠 Researchers introduce WOMBET, a framework that improves reinforcement learning efficiency in robotics by generating synthetic training data from a world model in source tasks and selectively transferring it to target tasks. The approach combines offline-to-online learning with uncertainty-aware planning to reduce data collection costs while maintaining robustness.
AI · Bullish · arXiv – CS AI · 3d ago · 6/10
🧠 Researchers propose a neuro-symbolic deep reinforcement learning approach that integrates logical rules and symbolic knowledge to improve sample efficiency and generalization in RL systems. The method transfers partial policies from simple tasks to complex ones, reducing training data requirements and improving performance in sparse-reward environments compared to existing baselines.
AI · Bullish · arXiv – CS AI · Mar 17 · 6/10
🧠 Researchers propose MA-VLCM, a framework that uses pretrained vision-language models as centralized critics in multi-agent reinforcement learning instead of learning critics from scratch. This approach significantly improves sample efficiency and enables zero-shot generalization while producing compact policies suitable for resource-constrained robots.
AI · Bullish · arXiv – CS AI · Mar 17 · 6/10
🧠 Researchers introduce XQC, a deep reinforcement learning algorithm that achieves state-of-the-art sample efficiency by optimizing the critic network's condition number through batch normalization, weight normalization, and distributional cross-entropy loss. The method outperforms existing approaches across 70 continuous control tasks while using fewer parameters.
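XQC's full recipe is not reproduced here, but the quantity the summary says it optimizes can be made concrete: the condition number of a weight matrix is the ratio of its largest to smallest singular value, and weight normalization removes per-row scale imbalance that inflates it. A minimal sketch under those assumptions:

```python
import numpy as np

def cond(W):
    """Condition number: ratio of largest to smallest singular value."""
    s = np.linalg.svd(W, compute_uv=False)
    return s[0] / s[-1]

def weight_normalize(W):
    """Rescale each row to unit L2 norm (weight normalization with the
    learnable gain fixed at 1)."""
    return W / np.linalg.norm(W, axis=1, keepdims=True)

# A badly scaled critic layer: well-conditioned directions (orthogonal Q)
# multiplied by row scales spanning seven orders of magnitude.
rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.normal(size=(8, 8)))
W = np.diag(10.0 ** np.arange(8)) @ Q
```

Here `cond(W)` is 1e7, while `cond(weight_normalize(W))` collapses to 1, since normalizing the rows recovers the orthogonal factor; a well-conditioned critic gives gradient steps of comparable magnitude in all directions, which is one plausible route to the sample-efficiency gains the paper reports.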
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠 Researchers introduce HiMAC, a hierarchical reinforcement learning framework that improves LLM agent performance on long-horizon tasks by separating macro-level planning from micro-level execution. The approach demonstrates state-of-the-art results across multiple environments, showing that structured hierarchy is more effective than simply scaling model size for complex agent tasks.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10
🧠 Researchers propose EfficientZero-Multitask (EZ-M), a multi-task model-based reinforcement learning algorithm that scales the number of tasks rather than samples per task for robotics training. The approach achieves state-of-the-art performance on HumanoidBench with significantly higher sample efficiency by leveraging shared world models across diverse tasks.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10
🧠 Researchers propose a new approach to generalized planning that learns explicit transition models rather than directly predicting action sequences. This method achieves better out-of-distribution performance with fewer training instances and smaller models compared to Transformer-based planners like PlanGPT.
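The contrast the summary draws (learn a transition model, then search, rather than predict action sequences directly) can be sketched generically: once a successor function is available, an off-the-shelf planner like breadth-first search produces plans for start/goal pairs never seen during training. The toy world below is a hand-coded stand-in for a learned model, not the paper's setup.

```python
from collections import deque

def plan(start, goal, transitions, max_depth=50):
    """BFS over an explicit transition model: state -> {action: next_state}.
    Returns a shortest action sequence, or None if the goal is unreachable
    within max_depth."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, actions = frontier.popleft()
        if state == goal:
            return actions
        if len(actions) >= max_depth:
            continue
        for action, nxt in transitions(state).items():
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, actions + [action]))
    return None

# Toy 1-D world standing in for a learned model: move left/right on a line.
def line_world(s):
    return {"left": s - 1, "right": s + 1}
```

Because generalization lives in the transition model rather than in memorized plans, longer or out-of-distribution instances cost only more search, which is consistent with the out-of-distribution advantage the summary claims.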
AI · Neutral · arXiv – CS AI · Mar 3 · 4/10
🧠 Researchers introduce COMBRL, a new reinforcement learning algorithm designed for continuous-time systems using nonlinear ordinary differential equations. The algorithm achieves sublinear regret and better sample efficiency compared to existing methods by combining probabilistic models with uncertainty-aware exploration.
AI · Neutral · arXiv – CS AI · Mar 2 · 4/10
🧠 Researchers propose ACWI, a new reinforcement learning framework that dynamically balances intrinsic and extrinsic rewards through adaptive scaling coefficients. The system uses a lightweight Beta Network to optimize exploration in sparse reward environments, demonstrating improved sample efficiency and stability in MiniGrid experiments.
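ACWI's Beta Network is not specified in the summary; the general pattern it adapts can be sketched with a hand-rolled stand-in: a coefficient beta scales an intrinsic exploration bonus onto the extrinsic reward and shrinks once real reward starts flowing. Everything below is an assumption-laden illustration, not the paper's update rule.

```python
def combined_reward(r_ext, r_int, beta):
    """Total reward: extrinsic plus a scaled intrinsic exploration bonus."""
    return r_ext + beta * r_int

def update_beta(beta, r_ext, decay=0.9, floor=0.01):
    """Shrink the intrinsic weight once extrinsic reward appears
    (a fixed-schedule stand-in for ACWI's learned Beta Network)."""
    return max(floor, beta * decay) if r_ext > 0 else beta

# Sparse-reward episode: no extrinsic signal for two steps, then reward.
beta = 1.0
trace = []
for r_ext in [0.0, 0.0, 1.0, 1.0, 1.0]:
    trace.append(combined_reward(r_ext, r_int=0.5, beta=beta))
    beta = update_beta(beta, r_ext)
```

The point of making beta adaptive rather than constant is visible even in this toy: exploration pressure stays high while rewards are absent and fades once the extrinsic signal takes over.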
AI · Neutral · arXiv – CS AI · Mar 2 · 4/10
🧠 Researchers propose OVMSE, a new framework for Offline-to-Online Multi-Agent Reinforcement Learning that addresses key challenges in transitioning from offline training to online fine-tuning. The framework introduces Offline Value Function Memory and Sequential Exploration strategies to improve sample efficiency and performance in multi-agent environments.