EgoCross: Benchmarking Multimodal Large Language Models for Cross-Domain Egocentric Video Question Answering
Researchers introduce EgoCross, a new benchmark to evaluate multimodal AI models on egocentric video understanding across diverse domains like surgery, extreme sports, and industrial settings. The study reveals that current AI models, including specialized egocentric models, struggle with cross-domain generalization beyond common daily activities.