AI · Neutral · arXiv – CS AI · 10h ago · 6/10
CrossVL: Complexity-Aware Feature Routing and Paired Curriculum for Cross-View Vision-Language Detection
CrossVL is a framework that combines Complexity-Aware Pathway Aggregation with Paired Curriculum Learning to improve vision-language model performance in cross-view object detection. It targets the challenges that arise when a model must operate across ground and aerial viewpoints, and it reports improved detection accuracy and consistency on the MAVREC dataset.