y0news
#physics
2 articles
AI · Bearish · arXiv – CS AI · 4h ago

CMT-Benchmark: A Benchmark for Condensed Matter Theory Built by Expert Researchers

Researchers created CMT-Benchmark, a dataset of 50 expert-level condensed matter theory problems for evaluating large language models' capabilities in advanced scientific research. The best-performing model (GPT5) solved only 30% of the problems, and the average across 17 models was just 11.4%, highlighting significant gaps in current AI's physical reasoning abilities.

AI · Neutral · arXiv – CS AI · 4h ago

NuBench: An Open Benchmark for Deep Learning-Based Event Reconstruction in Neutrino Telescopes

NuBench is a new open benchmark for deep learning-based event reconstruction in neutrino telescopes, comprising seven large-scale simulated datasets with nearly 130 million neutrino interactions. The benchmark enables comparison of machine learning reconstruction methods across different detector geometries and evaluates four algorithms, including ParticleNeT and DynEdge, on core reconstruction tasks.