y0news
#automated-synthesis (1 article)
AI · Neutral · arXiv – CS AI · Feb 27 · 6/107

SPM-Bench: Benchmarking Large Language Models for Scanning Probe Microscopy

Researchers have developed SPM-Bench, a PhD-level benchmark for testing large language models on scanning probe microscopy tasks. The benchmark uses automated data synthesis from scientific papers and introduces new evaluation metrics to assess AI reasoning capabilities in specialized scientific domains.