y0news

#partial-observability News & Analysis

10 articles tagged with #partial-observability. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · 2d ago · 7/10

Flow Equivariant World Models: Memory for Partially Observed Dynamic Environments

Researchers introduce Flow Equivariant World Models, a framework that uses time-parameterized symmetries to improve how AI systems predict dynamics in partially observed environments. The approach significantly outperforms existing diffusion and recurrent models by maintaining equivariant memory structures that track both observed and unobserved regions as they evolve over time.

AI · Bullish · arXiv – CS AI · Mar 16 · 7/10

Guided Policy Optimization under Partial Observability

Researchers introduce Guided Policy Optimization (GPO), a reinforcement learning framework for partially observable environments that co-trains a guider, which has access to privileged information, with a learner that is trained through imitation learning. The method is shown theoretically to achieve optimality comparable to direct RL and performs strongly in experiments across tasks including continuous control and memory-based challenges.

AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

ELMUR: External Layer Memory with Update/Rewrite for Long-Horizon RL Problems

Researchers developed ELMUR, a new AI architecture that uses external memory to help robots make better decisions over extremely long time periods. The system achieved 100% success on tasks requiring memory of up to one million steps and nearly doubled performance on robotic manipulation tasks compared to existing methods.

AI · Neutral · arXiv – CS AI · 1d ago · 6/10

(HB-ARFM) History-Bootstrapped Flow Matching for Inverse Boiling Reconstruction

Researchers introduce History-Bootstrapped Flow Matching (HB-ARFM), a machine learning method for reconstructing complete spatiotemporal fields from partial observations, demonstrating particular success in recovering velocity and temperature fields from limited boiling dynamics data. The approach addresses a fundamental challenge in scientific inference where incomplete observations create ill-posed inverse problems that traditional single-timestep models cannot solve.

AI · Neutral · arXiv – CS AI · 5d ago · 6/10

Recurrent Structural Policy Gradient for Partially Observable Mean Field Games

Researchers introduce Recurrent Structural Policy Gradient (RSPG), an algorithmic advancement for solving Mean Field Games with partial observability by combining policy gradient methods with structural knowledge of system dynamics. The method achieves significantly faster convergence than model-free approaches while enabling history-aware behavior, accompanied by MFAX, a new JAX-based research framework for MFG implementations.

AI · Neutral · arXiv – CS AI · May 27 · 6/10

When Does Adaptive Guidance Help? Belief-Aware Privileged Distillation for Autonomous Driving Under Partial Observability

Researchers present Belief-Aware GSAC, an adaptive knowledge distillation method for autonomous driving that modulates teacher guidance based on ensemble disagreement. Testing reveals that adaptive guidance helps under mild-to-moderate partial observability but fails under severe occlusion because of 'observability blindness': the ensemble members agree closely on the visible data, so disagreement stays low even though occluded information is missing.
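
The failure mode described above can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the use of standard deviation as the disagreement measure, the saturating weight formula, and the assumption that higher disagreement leads to more teacher guidance are all hypothetical choices made for illustration.

```python
import statistics
from typing import Sequence


def ensemble_disagreement(predictions: Sequence[float]) -> float:
    """Disagreement as the population standard deviation of member predictions.

    Hypothetical measure; the summarized paper may define it differently.
    """
    return statistics.pstdev(predictions)


def guidance_weight(disagreement: float, scale: float = 1.0) -> float:
    """Map disagreement to a weight in [0, 1).

    Assumption: more disagreement means more reliance on the teacher.
    """
    return disagreement / (disagreement + scale)


# Members that all see the same visible scene agree almost perfectly,
# even if a hazard is occluded and absent from every member's input.
occluded_scene = [0.50, 0.50, 0.50]
# Members that genuinely differ produce a larger spread.
ambiguous_scene = [0.10, 0.90, 0.50]

w_occluded = guidance_weight(ensemble_disagreement(occluded_scene))
w_ambiguous = guidance_weight(ensemble_disagreement(ambiguous_scene))
print(w_occluded, w_ambiguous)
```

Under these assumptions the occluded scene receives no extra guidance because the ensemble cannot disagree about information none of its members observe, which is one way to read the 'observability blindness' finding.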

AI · Neutral · arXiv – CS AI · May 11 · 6/10

Multi-Environment POMDPs with Finite-Horizon Objectives

Researchers establish that computing optimal policies for Multi-Environment POMDPs with finite-horizon objectives remains PSPACE-complete, matching the complexity of standard POMDPs. The work introduces a practical algorithm that substantially outperforms prior methods on benchmark problems.

AI · Neutral · arXiv – CS AI · Apr 14 · 6/10

LLMs for Text-Based Exploration and Navigation Under Partial Observability

Researchers evaluated whether large language models can function as text-only controllers for navigation and exploration in unknown environments under partial observability. Testing nine contemporary LLMs on ASCII gridworld tasks, they found reasoning-tuned models reliably complete navigation goals but remain inefficient compared to optimal paths, with few-shot prompting reducing invalid moves and improving path efficiency.
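
To make the setting concrete, here is a rough sketch of what a partially observable ASCII gridworld observation might look like. The grid layout, the symbols (`#` wall, `.` free cell, `G` goal, `A` agent, `?` outside the map), the square view window, and the `local_view` helper are all assumptions for illustration and are not taken from the study.

```python
from typing import List, Tuple


def local_view(grid: List[str], pos: Tuple[int, int], radius: int = 1) -> List[str]:
    """Return the square window of side 2*radius+1 centered on the agent.

    The agent's own cell is drawn as 'A'; cells outside the map are drawn as '?'.
    """
    r0, c0 = pos
    rows = []
    for r in range(r0 - radius, r0 + radius + 1):
        row = []
        for c in range(c0 - radius, c0 + radius + 1):
            if (r, c) == (r0, c0):
                row.append("A")
            elif 0 <= r < len(grid) and 0 <= c < len(grid[r]):
                row.append(grid[r][c])
            else:
                row.append("?")
        rows.append("".join(row))
    return rows


# Hypothetical 5x5 map; the goal 'G' sits in the top-right corner of the room.
GRID = [
    "#####",
    "#..G#",
    "#.#.#",
    "#...#",
    "#####",
]

# The agent starts in the bottom-left corner of the room.
view = local_view(GRID, (3, 1), radius=1)
print("\n".join(view))
```

In this sketch the text passed to a model as its observation would be the three-line window only. The goal is outside that window, so a controller would have to explore and remember what it has seen, which is the kind of behavior the summarized evaluation measures.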