#open-vocabulary News & Analysis

8 articles tagged with #open-vocabulary. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 19 · 7/10

Lagrange: An Open-Vocabulary, Energy-Based Sparse Framework for Generalized End-to-End Driving

Researchers introduce Lagrange, an open-vocabulary autonomous driving framework that combines Vision-Language Models with sparse, energy-based planning to address limitations in existing end-to-end driving systems. The approach balances computational efficiency with generalization capacity for handling out-of-distribution scenarios while maintaining kinematic feasibility.

AI · Neutral · arXiv – CS AI · Jun 11 · 6/10

LASA: A Weak Supervision Method for Open-Vocabulary Scene Sketch Semantic Segmentation

Researchers introduce LASA, a weak supervision method for open-vocabulary sketch semantic segmentation that aggregates multi-layer Vision Transformer attention maps to capture complementary spatial cues. The approach achieves significant improvements over baselines without requiring pixel-level annotations, advancing computer vision capabilities for sparse line drawing interpretation.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

GeoSAM-3D: Geodesic Prompt Propagation for Open-Vocabulary 3D Scene Segmentation from Monocular Video

GeoSAM-3D introduces a novel approach to 3D scene segmentation from monocular video by combining foundation models with Gaussian Splatting and geodesic propagation, enabling users to segment objects with simple clicks or text prompts without requiring RGB-D cameras or pre-reconstructed meshes.

AI · Neutral · arXiv – CS AI · Jun 2 · 6/10

PSG-Nav: Probabilistic Scene Graph Navigation via Multiverse Decision Making

Researchers introduce PSG-Nav, a novel navigation system that uses probabilistic scene graphs to help AI agents navigate complex environments while accounting for perception uncertainty. The system achieves state-of-the-art results on three major benchmarks by employing multiverse decision-making and an evidential calibrator to reduce false positives in open-vocabulary navigation tasks.

AI · Neutral · arXiv – CS AI · May 9 · 6/10

Open-SAT: LLM-Guided Query Embedding Refinement for Open-Vocabulary Object Retrieval in Satellite Imagery

Researchers introduce Open-SAT, a training-free algorithm that uses Large Language Models to refine query embeddings for satellite image retrieval. By applying LLM-guided contextual refinement at inference time, the method improves on existing vision-language models by up to 16% in F1 score on open-vocabulary satellite imagery tasks.

AI · Neutral · arXiv – CS AI · May 7 · 6/10

Ilov3Splat: Instance-Level Open-Vocabulary 3D Scene Understanding in Gaussian Splatting

Ilov3Splat introduces a framework for understanding 3D scenes using natural language by combining 3D Gaussian Splatting with CLIP features and SAM masks. The method achieves better cross-view consistency and instance-level reasoning than prior approaches, enabling object identification without manual annotation.

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 5

From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects

Researchers have developed a framework that enables open-vocabulary object detection models to operate in real-world settings by identifying and learning previously unseen objects. The method introduces two techniques, Open World Embedding Learning (OWEL) and Multi-Scale Contrastive Anchor Learning (MSCAL), to detect unknown objects and reduce misclassification errors.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

Catch Me If You Can Describe Me: Open-Vocabulary Camouflaged Instance Segmentation with Diffusion

Researchers have developed a new AI method for open-vocabulary camouflaged instance segmentation (OVCIS) using diffusion models and text-to-image techniques. The approach addresses the challenge of detecting camouflaged objects by leveraging cross-domain textual-visual features, showing improvements over existing methods on benchmark datasets.