AI · Bullish · arXiv – CS AI · 7h ago · 6/10
MechVQA: Benchmarking and Enhancing Multimodal LLMs on Comprehensive Mechanical Drawing Understanding
Researchers introduce MechVQA, the first comprehensive dataset for evaluating multimodal large language models (MLLMs) on mechanical drawing understanding, containing 3.3k annotated drawings with 21k question-answer pairs across three capability levels. They develop MechVL, a domain-specialized model that outperforms existing baselines by 7.57 percentage points, establishing a foundation for deploying AI in mechanical design and engineering inspection workflows.