
Robust Zero-Shot Generalization for Open-Vocabulary Action Recognition via Task Arithmetic

arXiv – CS AI | Francesca Morandi, Omayma Moussadek, Federico Venturini, Mauro Suardi, Alessandro Banzatti, Francesco Cannarile, Angelo Porrello, Simone Calderara | June 23, 2026 at 04:00 AM

🤖 AI Summary

Researchers propose an approach to Open-Vocabulary Action Recognition (OVAR) based on task arithmetic and model merging, enabling zero-shot generalization to novel actions without costly domain-specific fine-tuning. By combining task vectors from models trained on diverse public datasets, the method is reported to improve out-of-distribution performance over a generic pretrained model while avoiding the privacy and regulatory concerns associated with target-domain training.

Analysis

This research addresses a fundamental challenge in computer vision: recognizing actions outside predefined classes without expensive retraining on target domains. Traditional OVAR systems rely on vision-language models but typically degrade when encountering distribution shifts in real-world deployments. The proposed approach leverages task arithmetic—a technique that extracts learned task-specific information as vectors and recombines them—to create a more robust merged model without accessing target-domain data.
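The recombination step described above can be illustrated with a minimal sketch of task arithmetic in its commonly cited form: a task vector is the difference between fine-tuned and pretrained weights, and the merged model adds a scaled sum of task vectors back onto the pretrained weights. The checkpoint layout, the toy values, and the `scale` coefficient are assumptions for illustration; the paper's actual merging recipe may differ.

```python
# Minimal sketch of task arithmetic on toy "checkpoints".
# Assumption: each checkpoint is a dict mapping parameter names to flat lists
# of floats, and all checkpoints share the same architecture.

def task_vector(finetuned, pretrained):
    """Element-wise difference: fine-tuned weights minus pretrained weights."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }

def merge(pretrained, task_vectors, scale=1.0):
    """Add the scaled sum of task vectors onto the pretrained weights."""
    if not task_vectors:
        # Nothing to merge: return a copy of the pretrained weights.
        return {name: list(values) for name, values in pretrained.items()}
    merged = {}
    for name, base in pretrained.items():
        # Sum the task vectors position by position for this parameter.
        summed = [sum(vals) for vals in zip(*(tv[name] for tv in task_vectors))]
        merged[name] = [b + scale * s for b, s in zip(base, summed)]
    return merged

# Hypothetical toy weights standing in for models fine-tuned on two public datasets.
pretrained = {"proj": [1.0, 2.0, 3.0]}
finetuned_a = {"proj": [1.5, 2.0, 3.0]}
finetuned_b = {"proj": [1.0, 2.0, 2.0]}

vectors = [
    task_vector(finetuned_a, pretrained),
    task_vector(finetuned_b, pretrained),
]
merged = merge(pretrained, vectors, scale=0.5)
print(merged)
```

Note that no target-domain data appears anywhere in the merge: only the pretrained weights and the fine-tuned checkpoints are needed. In practice `scale` would likely be chosen on held-out data from the public datasets.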

The significance lies in its practical implications. Real-world video analysis applications in security, autonomous systems, and content moderation frequently encounter action types that were absent from the training data. Requiring domain-specific fine-tuning adds computational overhead, raises data privacy issues, and can trigger regulatory complications under frameworks like GDPR. This work sidesteps those constraints by remaining zero-shot with respect to the target domain: no target-domain video is used for training.
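To make the zero-shot setting concrete, the sketch below shows how a vision-language model is typically used for open-vocabulary recognition: a video embedding is compared with text embeddings of candidate action labels, and the most similar label wins. The embeddings here are hand-written placeholders, and `zero_shot_predict` is a hypothetical helper, not code from the paper; a real system would obtain the vectors from the model's video and text encoders.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def zero_shot_predict(video_embedding, label_embeddings):
    """Return the best-matching label and the similarity score of every label."""
    scores = {
        label: cosine(video_embedding, embedding)
        for label, embedding in label_embeddings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical text embeddings for labels the model was never fine-tuned on.
label_embeddings = {
    "opening a door": [1.0, 0.0, 0.0],
    "pouring water": [0.0, 1.0, 0.0],
    "climbing a ladder": [0.0, 0.0, 1.0],
}
# Hypothetical embedding of an incoming video clip.
video_embedding = [0.1, 0.9, 0.2]

best, scores = zero_shot_predict(video_embedding, label_embeddings)
print(best)
```

Because the label set is just a dictionary of text embeddings, adding a new action means adding a new entry rather than retraining, which is what makes the vocabulary open.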

The central finding is that recombining knowledge from diverse public sources outperforms relying on a generic pretrained model. The result matters beyond action recognition, since model merging is increasingly relevant as organizations seek to combine specialized models without full retraining. For developers and researchers, it offers a cost-effective path to deployment that does not require collecting target-domain data.

Looking forward, the validation of task arithmetic in OVAR raises questions about scaling to larger sets of merged models and to other vision-language tasks. The availability of code should ease community adoption and extension, and could help make this a common approach to zero-shot generalization in multimodal AI systems.

Key Takeaways

→ Task arithmetic merges diverse OVAR models to improve zero-shot generalization without target-domain training.
→ The approach avoids the privacy and regulatory concerns associated with domain-specific fine-tuning on sensitive video data.
→ Knowledge recombined from public datasets outperforms a single pretrained model in out-of-distribution action recognition.
→ Model merging offers practical cost savings by avoiding expensive retraining cycles in production systems.
→ The open-source code release should ease adoption of this approach for zero-shot generalization in vision-language tasks.

#action-recognition #zero-shot-learning #model-merging #task-arithmetic #vision-language #ovar #generalization #computer-vision

