AIBullisharXiv – CS AI · Apr 147/10
🧠Researchers demonstrate that physics simulators can generate synthetic training data for large language models, enabling them to learn physical reasoning without relying on scarce internet QA pairs. Models trained on simulated data show 5-10 percentage point improvements on International Physics Olympiad problems, suggesting simulators offer a scalable alternative for domain-specific AI training.
AINeutralarXiv – CS AI · Apr 137/10
🧠Researchers introduce PilotBench, a benchmark evaluating large language models on safety-critical aviation tasks using 708 real-world flight trajectories. The study reveals a fundamental trade-off: traditional forecasters achieve superior numerical precision (7.01 MAE) while LLMs provide better instruction-following (86-89%) but with significantly degraded prediction accuracy (11-14 MAE), exposing brittleness in implicit physics reasoning for embodied AI applications.
AINeutralarXiv – CS AI · May 126/10
🧠Researchers introduce SeePhys Pro, a benchmark revealing that advanced AI models significantly degrade in physics reasoning when visual information replaces text, with visual grounding as the primary failure point. The study further demonstrates that multimodal reinforcement learning improvements can stem from non-visual textual cues rather than genuine visual understanding, challenging current evaluation methodologies.
AINeutralarXiv – CS AI · May 16/10
🧠Researchers have published a comprehensive survey on Physical AI that bridges the gap between physical perception and symbolic physics reasoning in AI systems. The work advocates for next-generation world models that integrate physical laws, embodied reasoning, and generative approaches to create AI systems with genuine understanding of physical phenomena rather than pure pattern recognition.
AINeutralarXiv – CS AI · Apr 76/10
🧠Researchers identify critical limitations in current Multimodal Large Language Models' ability to understand physics and physical world dynamics. They propose Scene Dynamic Field (SDF), a new approach using physics simulators that achieves up to 20.7% performance improvements on fluid dynamics tasks.
AINeutralarXiv – CS AI · Mar 176/10
🧠Researchers introduce the Infinite Problem Generator (IPG), an AI framework that creates verifiable physics problems using executable Python code instead of probabilistic text generation. The system released ClassicalMechanicsV1, a dataset of 1,335 physics problems that demonstrates how code complexity can precisely measure problem difficulty for training large language models.
AINeutralarXiv – CS AI · Mar 54/10
🧠Researchers trained a compact 1.5B parameter language model to solve beam physics problems using reinforcement learning with verifiable rewards, achieving 66.7% improvement in accuracy. However, the model learned pattern-matching templates rather than true physics reasoning, failing to generalize to topological changes despite mastering the same underlying equations.