#physical-reasoning News & Analysis

7 articles tagged with #physical-reasoning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 5 · 7/10

The Invisible Hand of Physics: When Video Diffusion Models Know More Than They Show

Researchers demonstrate that video diffusion models internally encode physical plausibility without being explicitly trained to do so: physical validity can be decoded from the models' internal states with 81% accuracy. This finding suggests generative AI systems develop meaningful representations of physics as an emergent property of the denoising process rather than through supervised learning.

AI · Bearish · arXiv – CS AI · Jun 2 · 7/10

InPhyRe Discovers: Large Multimodal Models Struggle in Inductive Physical Reasoning

Researchers introduced InPhyRe, a new benchmark showing that large multimodal models (LMMs) struggle with inductive physical reasoning—their ability to apply learned physical laws to novel, unseen scenarios. Testing 13 LMMs revealed critical weaknesses: models fail to generalize parametric knowledge, perform poorly with unseen physical laws, and exhibit language bias that causes them to ignore visual inputs, raising concerns about their reliability for safety-critical applications.

AI · Bearish · arXiv – CS AI · May 12 · 7/10

MDGYM: Benchmarking AI Agents on Molecular Simulations

Researchers introduced MDGYM, a benchmark testing AI agents' ability to autonomously execute molecular dynamics simulations, finding that even the strongest systems solve only 21% of easy tasks. The poor performance reveals that advanced code generation does not translate to physical reasoning, exposing a critical gap between general software engineering competence and domain-specific scientific workflows.

🧠 Claude

AI · Bullish · arXiv – CS AI · Apr 13 · 7/10

PhysInOne: Visual Physics Learning and Reasoning in One Suite

PhysInOne is a large-scale synthetic dataset containing 2 million videos across 153,810 dynamic 3D scenes designed to address the scarcity of physics-grounded training data for AI systems. The dataset covers 71 physical phenomena and includes comprehensive annotations, demonstrating significant improvements in physics-aware video generation, prediction, and property estimation when used to fine-tune foundation models.

AI · Bearish · arXiv – CS AI · Jun 2 · 6/10

Vision Language Models Cannot Reason About Physical Transformation

Researchers demonstrate that Vision Language Models (VLMs) systematically fail to understand physical transformations, revealing fundamental gaps in how these systems reason about dynamic environments. Evaluating 112 VLMs on conservation principles with ConservationBench, the study finds that models perform near chance level regardless of prompting strategy or temporal resolution, which suggests they lack genuine comprehension of invariant physical properties rather than simply lacking training data.

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

Physically Viable World Models: A Case for Query-Conditioned Embodied AI

Researchers propose that world models for embodied AI must be physically viable, meaning they are designed to answer intervention queries by representing actual physical structures rather than just predicting observations. Current observation-predictive models fall short because visually identical scenes can behave differently under intervention, so such models may recommend unsafe or infeasible actions.

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

BilliardPhys-Bench: Benchmarking Physical Reasoning and Visual Dynamics of Multimodal LLMs

Researchers introduced BilliardPhys-Bench, a benchmark that tests multimodal AI models' ability to predict physical interactions in billiards simulations. The evaluation reveals that leading LLMs from OpenAI, Anthropic, Google, and Alibaba struggle with dynamic physics reasoning, exhibiting systematic failures including a 'stasis bias' where models default to predicting no interaction when physical outcomes become difficult to infer.

🧠 Claude · 🧠 Gemini