
TTOM: Test-Time Optimization and Memorization for Compositional Video Generation

arXiv - CS AI | Leigang Qu, Ziyang Wang, Na Zheng, Wenjie Wang, Liqiang Nie, Tat-Seng Chua
AI Summary

Researchers introduce TTOM (Test-Time Optimization and Memorization), a training-free framework that improves compositional video generation in Video Foundation Models during inference. The system uses layout-attention optimization and parametric memory to better align text prompts with generated video outputs, showing strong transferability across different scenarios.

Key Takeaways
  • TTOM addresses compositional weaknesses in Video Foundation Models without requiring additional training.
  • The framework uses a parametric memory mechanism that supports flexible operations: insert, read, update, and delete.
  • TTOM demonstrates strong transferability and generalization by disentangling compositional world knowledge.
  • Experimental results on the T2V-CompBench and VBench benchmarks show the framework is effective and scalable.
  • The approach focuses on spatiotemporal layout alignment during inference rather than intervening directly on latents.
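
The four memory operations named above can be pictured with a minimal Python sketch. This is an illustration only, not the TTOM implementation: the class name `ParametricMemory`, the use of a prompt-pattern string as the key, and the plain list of floats standing in for optimized parameters are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ParametricMemory:
    """Toy keyed store of optimized parameters (hypothetical, not the paper's code)."""

    _store: Dict[str, List[float]] = field(default_factory=dict)

    def insert(self, key: str, params: List[float]) -> None:
        # Add parameters for a key that is not yet stored.
        if key in self._store:
            raise KeyError(f"key already present: {key}")
        self._store[key] = list(params)

    def read(self, key: str) -> Optional[List[float]]:
        # Return a copy of the stored parameters, or None if the key is absent.
        params = self._store.get(key)
        return None if params is None else list(params)

    def update(self, key: str, params: List[float]) -> None:
        # Overwrite the parameters of an existing key.
        if key not in self._store:
            raise KeyError(f"key not found: {key}")
        self._store[key] = list(params)

    def delete(self, key: str) -> None:
        # Remove a key; deleting an absent key is a no-op.
        self._store.pop(key, None)


# Example usage with made-up keys and values.
memory = ParametricMemory()
memory.insert("cat left of dog", [0.1, 0.2])
memory.update("cat left of dog", [0.15, 0.25])
print(memory.read("cat left of dog"))
memory.delete("cat left of dog")
print(memory.read("cat left of dog"))
```

In this sketch a read that misses returns `None`, which a caller could treat as the signal to run test-time optimization from scratch and then insert the result. That flow is an assumption about how such a memory might be used, not a description of the paper's procedure.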