y0news
#vlm-performance (1 article)
AI · Neutral · arXiv – CS AI · Feb 27

PoSh: Using Scene Graphs To Guide LLMs-as-a-Judge For Detailed Image Descriptions

Researchers introduce PoSh, a new evaluation metric for detailed image descriptions that uses scene graphs to guide LLMs-as-a-Judge, achieving better correlation with human judgments than existing methods. They also present DOCENT, a challenging benchmark dataset featuring artwork with expert-written descriptions to evaluate vision-language models' performance on complex image analysis.