AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠GeoSAM-3D introduces a novel approach to 3D scene segmentation from monocular video by combining foundation models with Gaussian Splatting and geodesic propagation, enabling users to segment objects with simple clicks or text prompts without requiring RGB-D cameras or pre-reconstructed meshes.
AI · Neutral · arXiv – CS AI · Jun 2 · 6/10
🧠Researchers introduce PSG-Nav, a novel navigation system that uses probabilistic scene graphs to help AI agents navigate complex environments while accounting for perception uncertainty. The system achieves state-of-the-art results on three major benchmarks by employing multiverse decision-making and an evidential calibrator to reduce false positives in open-vocabulary navigation tasks.
AI · Neutral · arXiv – CS AI · May 9 · 6/10
🧠Researchers introduce Open-SAT, a training-free algorithm that uses Large Language Models to refine query embeddings for satellite image retrieval. The method builds on existing vision-language models by applying LLM-guided contextual refinement at inference time, improving F1 score by up to 16% on open-vocabulary satellite imagery tasks without additional training.
AI · Neutral · arXiv – CS AI · May 7 · 6/10
🧠Ilov3Splat introduces a framework for understanding 3D scenes using natural language by combining 3D Gaussian Splatting with CLIP features and SAM masks. The method achieves better cross-view consistency and instance-level reasoning than prior approaches, enabling object identification without manual annotation.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 5
🧠Researchers have developed a framework that enables open-vocabulary object detection models to operate in real-world settings by identifying and learning previously unseen objects. The method introduces two techniques, Open World Embedding Learning (OWEL) and Multi-Scale Contrastive Anchor Learning (MSCAL), to detect unknown objects and reduce misclassification errors.
AI · Neutral · arXiv – CS AI · Mar 5 · 4/10
🧠Researchers have developed a new AI method for open-vocabulary camouflaged instance segmentation (OVCIS) using diffusion models and text-to-image techniques. The approach addresses the challenge of detecting camouflaged objects by leveraging cross-domain textual-visual features, showing improvements over existing methods on benchmark datasets.