#scientific-discovery News & Analysis

63 articles tagged with #scientific-discovery. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.


AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 2

The FM Agent

Researchers have developed FM Agent, a multi-agent AI framework that combines large language models with evolutionary search to autonomously solve complex research problems. The system achieved state-of-the-art results across multiple domains including operations research, machine learning, and GPU optimization without human intervention.

AI · Bullish · arXiv – CS AI · Feb 27 · 7/10 · 6

Discovery of Interpretable Physical Laws in Materials via Language-Model-Guided Symbolic Regression

Researchers have developed a new framework that uses large language models to guide symbolic regression in discovering interpretable physical laws from high-dimensional materials data. The method shrinks the search space by a factor of approximately 10^5 compared to traditional approaches and successfully identified novel formulas for key properties of perovskite materials.

AI · Bullish · OpenAI News · Feb 13 · 7/10 · 6

GPT-5.2 derives a new result in theoretical physics

OpenAI's GPT-5.2 has independently derived a new mathematical formula for gluon amplitude in theoretical physics, which was subsequently formally proved and verified by OpenAI and academic collaborators. This represents a significant advancement in AI's capability to contribute to fundamental scientific research and discovery.

AI · Bullish · Google DeepMind Blog · Feb 9 · 7/10 · 5

Accelerating Mathematical and Scientific Discovery with Gemini Deep Think

Google's Gemini Deep Think is demonstrating significant impact across mathematical and scientific research fields according to emerging research papers. The AI system is accelerating discovery processes in various academic and research domains.

AI · Bullish · MIT News – AI · Feb 2 · 7/10 · 8

How generative AI can help scientists synthesize complex materials

MIT researchers developed DiffSyn, a generative AI model that provides recipes for synthesizing new materials. This breakthrough could accelerate scientific experimentation by reducing the time from hypothesis to practical application.

AI · Bullish · Google DeepMind Blog · Nov 24 · 7/10 · 5

Google DeepMind supports U.S. Department of Energy on Genesis: a national mission to accelerate innovation and scientific discovery

Google DeepMind has partnered with the U.S. Department of Energy on Genesis, a new national initiative designed to accelerate scientific discovery and innovation through artificial intelligence. This collaboration represents a significant government-private sector partnership in advancing AI applications for scientific research.

AI · Bullish · OpenAI News · Nov 20 · 7/10 · 6

Early experiments in accelerating science with GPT-5

OpenAI has released the first research cases demonstrating how GPT-5 accelerates scientific discovery across mathematics, physics, biology, and computer science. The AI system is shown collaborating with researchers to generate mathematical proofs, uncover new insights, and significantly increase the pace of scientific progress.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

AI Scientists as Engines of Discovery: A Case for Development within Reformed Institutions

Researchers propose that agentic AI systems are transitioning from computational tools into autonomous "AI scientists" capable of accelerating scientific discovery across literature synthesis, hypothesis generation, and model verification. The paper argues this requires fundamental institutional reforms around verification, accountability, and safety, and introduces Denario as a prototype multi-agent framework that can explore hypothesis spaces beyond human capability.

AI · Neutral · arXiv – CS AI · Jun 23 · 6/10

Active Causal Experimentalist (ACE): Learning Intervention Strategies via Direct Preference Optimization

Researchers introduce Active Causal Experimentalist (ACE), a machine learning system that learns optimal experimental design strategies using Direct Preference Optimization rather than traditional reward-based approaches. ACE achieves a 70-71% improvement over baseline methods by comparing pairs of interventions instead of absolute rewards, and autonomously discovers theoretically grounded experimental strategies such as concentrating interventions on parent variables in collider mechanisms.

AI · Bullish · arXiv – CS AI · Jun 23 · 6/10

Negative Knowledge as Failure-aware Shared Memory for AutoResearch

Researchers propose a 'negative knowledge' memory system for AI-assisted research that captures and structures failed experiments as reusable knowledge assets. The approach outperforms baseline AutoResearch systems while reducing token usage, and demonstrates transfer learning capabilities across different scientific problems in nonlinear PDE research.

AI · Neutral · arXiv – CS AI · Jun 11 · 6/10

ATLAS: Active Theory Learning for Automated Science

Researchers introduce ATLAS, an active learning framework that automates scientific discovery by iteratively generating mechanistic hypotheses and designing optimal experiments to distinguish between them. Tested on reinforcement learning agents, ATLAS achieves 5-10x improvement in sample efficiency compared to random experimentation, demonstrating significant potential for accelerating human-interpretable insights in cognitive science and other mechanistic modeling domains.

AI · Neutral · arXiv – CS AI · Jun 11 · 6/10

StatefulDiscovery: Evidence-Calibrated Claim Formation in Open-Ended Scientific Discovery

Researchers introduce StatefulDiscovery, a framework that enables AI agents to conduct open-ended scientific discovery by maintaining explicit investigation state and coupling it with evidence-calibrated claim formation. The system addresses the challenge of avoiding overinterpretation by coordinating exploration trajectory with evidential support, demonstrated across 40 real-data tasks where it outperformed baseline approaches in producing well-supported, high-value claims.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

Towards Diverse Scientific Hypothesis Search with Large Language Models

Researchers propose a new evolutionary framework for using large language models to generate diverse, high-quality scientific hypotheses by reformulating the search as a sampling problem inspired by parallel tempering. The approach addresses a critical limitation where traditional optimization-focused methods collapse into homogeneous solutions, enabling scientists to maintain multiple robust candidate hypotheses under fixed validation budgets across molecular, equation, and algorithm discovery domains.

AI · Neutral · arXiv – CS AI · Jun 10 · 6/10

ASyMOB: Algebraic Symbolic Mathematical Operations Benchmark

Researchers introduce ASyMOB, a 35,368-problem benchmark dataset for evaluating large language models on symbolic mathematics tasks. The dataset uses systematic perturbations to test genuine reasoning rather than pattern memorization, revealing that most models fail under minor problem variations while hybrid LLM-computer algebra system approaches show promise for scientific computing applications.

AI · Bullish · arXiv – CS AI · Jun 9 · 6/10

SciTrace: Trajectory-Aware Safety Reasoning for Scientific Discovery Agents

Researchers introduce SciTrace, a framework that integrates safety reasoning throughout LLM-based scientific agent pipelines rather than as a post-hoc filter. The system detects compositional risks from multi-step tool sequences that single-stage monitors miss, achieving state-of-the-art safety across six scientific domains while maintaining output quality.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

Self-Evolving Scientific Agent Discovers Generalizable Physically-Reasoned Fluid Control

Researchers developed a self-evolving scientific agent powered by large language models that autonomously discovers interpretable control policies for complex physical systems. The system successfully solved an underactuated fluid-dynamics problem (dogfish swimmer navigation) by iteratively testing strategies, diagnosing behaviors, and refining source code—achieving generalization to unseen targets without retraining.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

FunctionEvolve: Structure-Guided Symbolic Regression with LLMs

FunctionEvolve is a new evolutionary framework that combines expression trees with LLM guidance to recover exact mathematical equations from data, achieving 82.9% accuracy on synthetic benchmarks—significantly outperforming prior symbolic regression methods by making the search process structure-aware rather than structure-blind.

🧠 Claude · 🧠 Opus

AI · Bullish · MIT News – AI · Jun 4 · 6/10

NSF renews support for MIT-led AI and physics institute, expanding a new model for discovery

The National Science Foundation has renewed funding for MIT's Institute for AI and Fundamental Physics (IAIFI), moving into its second phase with expanded resources and ambitions. This renewal signals sustained institutional commitment to bridging artificial intelligence and physics research, establishing a model for collaborative discovery that could influence how fundamental science is conducted.

AI · Neutral · arXiv – CS AI · Jun 1 · 6/10

Auto-Discovery-Bench: Diagnosing Structured State Tracking in Oracle-Guided Discovery

Researchers introduce Auto-Discovery-Bench, a diagnostic benchmark that tests AI agents' ability to maintain and update structured beliefs through iterative hypothesis-intervention-feedback cycles. The benchmark reveals that performance degrades significantly as task complexity increases, and identifies limited long-range integration of structured information as a key bottleneck for scientific discovery agents.

AI · Neutral · arXiv – CS AI · May 29 · 6/10

ProjectionBench: Evaluating Scientific Hypothesis Generation in LLMs Under Progressive Information Disclosure

Researchers introduce ProjectionBench, a novel evaluation framework that tests large language models' scientific discovery capabilities by progressively revealing information about research problems. The benchmark assesses both innovative reasoning with minimal context and grounded hypothesis generation with full experimental details across 45 materials science papers, finding that GPT-5.4 and Gemini 3.1 Pro achieve strong alignment with ground-truth conclusions.

🧠 GPT-5 · 🧠 Gemini

AI · Neutral · arXiv – CS AI · May 29 · 6/10

Influence-Guided Symbolic Regression: Scientific Discovery via LLM-Driven Equation Search with Granular Feedback

Researchers introduce Influence-Guided Symbolic Regression (IGSR), a novel framework combining LLMs with Monte Carlo Tree Search to discover scientific equations more efficiently. The method uses granular influence scores to evaluate which components of equations contribute to accuracy, enabling systematic refinement. The approach demonstrated genuine discovery potential by identifying a novel relationship between DNA methylation and RNA Polymerase II pausing that was subsequently validated experimentally.

AI · Neutral · arXiv – CS AI · May 29 · 6/10

MOOSE-Copilot: A Web-Based Interactive Assistant for Unified Exploratory and Fine-Grained Scientific Hypothesis Discovery

MOOSE-Copilot introduces a unified framework for scientific hypothesis discovery that combines exploratory ideation with fine-grained refinement through structured human-AI interaction. The web-based system enables scientists to guide LLM-powered discovery processes via initial blueprints, routing decisions, and feedback mechanisms, outperforming autonomous baselines while lowering accessibility barriers through an intuitive visual interface.

🏢 Microsoft

AI · Bullish · Google DeepMind Blog · May 17 · 6/10

Gemini for Science: AI experiments and tools for a new era of discovery

Google has launched Gemini for Science, a collection of AI-powered tools and experiments designed to accelerate scientific discovery and research across multiple disciplines. The initiative aims to enhance the scale and precision of scientific exploration by leveraging advanced AI capabilities.

🧠 Gemini

AI · Bearish · arXiv – CS AI · May 12 · 6/10

Agentic AI Scientists Are Not Built For Autonomous Scientific Discovery

A new position paper argues that despite functioning as useful co-scientists, agentic AI systems are fundamentally not designed for truly autonomous scientific discovery due to challenges in problem selection bias, insufficient tacit knowledge in training data, compressed output diversity, and lack of real-world experimental feedback loops.

AI · Neutral · arXiv – CS AI · May 12 · 6/10

MaD Physics: Evaluating information seeking under constraints in physical environments

Researchers introduce MaD Physics, a benchmark for evaluating AI agents' ability to conduct scientific discovery under realistic resource constraints. The benchmark tests agents' capacity to make informative measurements within budget limits and infer underlying physical laws, using altered physics environments to prevent reliance on training data.

🧠 Gemini
