🧠 AI · Neutral · Importance: 6/10

Can VLMs Reason Robustly? A Neuro-Symbolic Investigation

arXiv – CS AI | Weixin Chen, Antonio Vergari, Han Zhao

🤖 AI Summary

Researchers investigated whether Vision-Language Models (VLMs) can reason robustly under distribution shifts and found that fine-tuned VLMs achieve high accuracy in-distribution but fail to generalize. They propose VLC, a neuro-symbolic method combining VLM-based concept recognition with circuit-based symbolic reasoning that demonstrates consistent performance under covariate shifts.

Key Takeaways
  • Fine-tuned VLMs achieve high in-distribution accuracy but fail to generalize under covariate shifts in visual reasoning tasks.
  • Traditional gradient-based end-to-end training does not reliably induce underlying reasoning functions in VLMs.
  • Recent neuro-symbolic approaches with black-box reasoning components still exhibit inconsistent robustness across tasks.
  • The proposed VLC method decouples perception from reasoning by combining VLM concept recognition with circuit-based symbolic execution.
  • VLC consistently achieves strong performance under covariate shifts across three distinct visual deductive reasoning tasks.
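The decoupling the takeaways describe can be made concrete with a minimal sketch. Everything below is illustrative and assumed, not the paper's implementation: `recognize_concepts` stands in for a VLM's concept classifier, and the "circuit" is a tiny hand-written deductive rule rather than VLC's actual compiled circuit. The point is the architecture: perception produces discrete concepts, and a fixed symbolic rule reasons over them, so a distribution shift affects only the perception stage.

```python
def recognize_concepts(image_features):
    """Perception stage (stand-in for a VLM concept recognizer):
    map raw features to named boolean concepts. The feature keys
    and thresholds here are hypothetical."""
    return {
        "is_red": image_features["redness"] > 0.5,
        "is_circle": image_features["roundness"] > 0.5,
    }

def symbolic_circuit(concepts):
    """Reasoning stage: a fixed logical rule executed symbolically.
    It never sees pixels, so its behavior is identical in- and
    out-of-distribution. Toy rule: target holds iff red AND circle."""
    return concepts["is_red"] and concepts["is_circle"]

def predict(image_features):
    # The two stages compose but stay decoupled: retraining or swapping
    # the perceiver leaves the reasoning rule unchanged.
    return symbolic_circuit(recognize_concepts(image_features))
```

For example, `predict({"redness": 0.9, "roundness": 0.8})` returns `True`, while `predict({"redness": 0.9, "roundness": 0.1})` returns `False`; under a covariate shift only `recognize_concepts` can degrade, which is the robustness argument the summary attributes to VLC.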
Read Original → via arXiv – CS AI