y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#partial-observability News & Analysis

5 articles tagged with #partial-observability. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

5 articles
AIBullisharXiv โ€“ CS AI ยท Mar 167/10
๐Ÿง 

Guided Policy Optimization under Partial Observability

Researchers introduce Guided Policy Optimization (GPO), a new reinforcement learning framework that addresses challenges in partially observable environments by co-training a guider with privileged information and a learner through imitation learning. The method demonstrates theoretical optimality comparable to direct RL and shows strong empirical performance across various tasks including continuous control and memory-based challenges.

AIBullisharXiv โ€“ CS AI ยท Mar 57/10
๐Ÿง 

ELMUR: External Layer Memory with Update/Rewrite for Long-Horizon RL Problems

Researchers developed ELMUR, a new AI architecture that uses external memory to help robots make better decisions over extremely long time periods. The system achieved 100% success on tasks requiring memory of up to one million steps and nearly doubled performance on robotic manipulation tasks compared to existing methods.

AINeutralarXiv โ€“ CS AI ยท Mar 46/103
๐Ÿง 

What Capable Agents Must Know: Selection Theorems for Robust Decision-Making under Uncertainty

Researchers prove 'selection theorems' showing that AI agents achieving low regret on prediction tasks must develop internal predictive models and belief states. The work demonstrates that structured internal representations are mathematically necessary, not just helpful, for competent decision-making under uncertainty.

AINeutralarXiv โ€“ CS AI ยท 5d ago6/10
๐Ÿง 

LLMs for Text-Based Exploration and Navigation Under Partial Observability

Researchers evaluated whether large language models can function as text-only controllers for navigation and exploration in unknown environments under partial observability. Testing nine contemporary LLMs on ASCII gridworld tasks, they found reasoning-tuned models reliably complete navigation goals but remain inefficient compared to optimal paths, with few-shot prompting reducing invalid moves and improving path efficiency.