AINeutralarXiv – CS AI · 14h ago6/10
🧠
CrystalXRD-Bench: Benchmarking Vision-Language Models for XRD Peak Indexing Across Diverse Crystalline Materials
Researchers introduced CrystalXRD-Bench, a 250-sample benchmark dataset for evaluating vision-language models on crystallographic peak indexing from X-ray diffraction patterns. Despite testing seven leading VLMs, the best model achieved only 37.6% exact-match accuracy, revealing significant gaps in how AI systems handle precise scientific figure interpretation and multi-step reasoning.
🧠 GPT-5