Hallucination Behavior in Multimodal LLMs Across Agricultural Image Interpretation and Generation Tasks
A comprehensive study finds that multimodal large language models hallucinate substantially in agricultural imaging tasks: zero-shot image interpretation reaches only 63-75% accuracy, and up to 91% of scenes produced by text-to-image generation are biologically inconsistent. These reliability gaps could undermine trust in AI-driven agricultural platforms.