y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#scene-understanding News & Analysis

4 articles tagged with #scene-understanding. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

4 articles
AIBullisharXiv โ€“ CS AI ยท Mar 37/105
๐Ÿง 

Vid-LLM: A Compact Video-based 3D Multimodal LLM with Reconstruction-Reasoning Synergy

Researchers propose Vid-LLM, a new video-based 3D multimodal large language model that processes video inputs without requiring external 3D data for scene understanding. The model uses a Cross-Task Adapter module and Metric Depth Model to integrate geometric cues and maintain consistency across 3D tasks like question answering and visual grounding.

AIBullisharXiv โ€“ CS AI ยท Mar 176/10
๐Ÿง 

Pixel-level Scene Understanding in One Token: Visual States Need What-is-Where Composition

Researchers propose CroBo, a new visual state representation learning framework that helps robotic agents better understand dynamic environments by encoding both semantic identities and spatial locations of scene elements. The framework uses a global-to-local reconstruction method that compresses observations into compact tokens, achieving state-of-the-art performance on robot policy learning benchmarks.

AIBullisharXiv โ€“ CS AI ยท Mar 26/1017
๐Ÿง 

LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans

Researchers have developed LiteReality, a novel pipeline that converts RGB-D scans of indoor environments into compact, realistic 3D virtual replicas suitable for AR/VR, gaming, robotics, and digital twins. The system features scene understanding, object retrieval, material painting, and physics integration to create graphics-ready environments that support object individuality and physically-based rendering.

AINeutralarXiv โ€“ CS AI ยท Apr 74/10
๐Ÿง 

TreeGaussian: Tree-Guided Cascaded Contrastive Learning for Hierarchical Consistent 3D Gaussian Scene Segmentation and Understanding

TreeGaussian introduces a new framework for 3D scene understanding that uses tree-guided cascaded contrastive learning to better capture hierarchical semantic relationships in complex 3D environments. The method addresses limitations in existing 3D Gaussian Splatting approaches by implementing structured learning across object-part hierarchies and improving segmentation consistency.