
#3d-reconstruction News & Analysis

18 articles tagged with #3d-reconstruction. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 17 · 7/10

3D-LFM: Lifting Foundation Model

Researchers have developed the first 3D Lifting Foundation Model (3D-LFM), which can reconstruct 3D structures from 2D landmarks without requiring correspondence across training data. The model uses a transformer architecture to achieve state-of-the-art performance across a wide range of object categories, with resilience to occlusions and noise.

AI · Bullish · arXiv – CS AI · Mar 11 · 7/10

World2Mind: Cognition Toolkit for Allocentric Spatial Reasoning in Foundation Models

Researchers introduce World2Mind, a training-free spatial intelligence toolkit that enhances foundation models' 3D spatial reasoning capabilities by up to 18%. The system uses 3D reconstruction and cognitive mapping to create structured spatial representations, enabling text-only models to perform complex spatial reasoning tasks.

AI · Bullish · arXiv – CS AI · Mar 5 · 7/10

ZipMap: Linear-Time Stateful 3D Reconstruction with Test-Time Training

Researchers introduce ZipMap, a new AI model for 3D reconstruction that achieves linear-time processing while maintaining accuracy comparable to slower quadratic-time methods. The system can reconstruct over 700 frames in under 10 seconds on a single H100 GPU, making it more than 20x faster than current state-of-the-art approaches like VGGT.

AI · Bullish · arXiv – CS AI · Apr 6 · 6/10

NavCrafter: Exploring 3D Scenes from a Single Image

NavCrafter is a new AI framework that creates flexible 3D scenes from a single image by generating novel-view video sequences with controllable camera movement. The system uses video diffusion models and enhanced 3D Gaussian Splatting to achieve superior 3D reconstruction and novel-view synthesis under large viewpoint changes.

AI · Neutral · arXiv – CS AI · Mar 17 · 6/10

EgoGrasp: World-Space Hand-Object Interaction Estimation from Egocentric Videos

EgoGrasp introduces the first method for reconstructing world-space hand-object interactions with open-vocabulary objects from egocentric videos. The multi-stage framework combines vision foundation models with body-guided diffusion models to achieve state-of-the-art performance in 3D scene reconstruction and hand pose estimation.

AI · Bullish · arXiv – CS AI · Mar 3 · 7/10

M-Gaussian: A Magnetic Gaussian Framework for Efficient Multi-Stack MRI Reconstruction

Researchers developed M-Gaussian, a new AI framework that adapts 3D Gaussian Splatting for efficient multi-stack MRI reconstruction. The method achieves 40.31 dB PSNR while being 14 times faster than existing implicit neural representation methods, offering improved balance between quality and computational efficiency.
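Several items on this page report reconstruction quality in PSNR (dB). For readers unfamiliar with the metric, here is a minimal, generic sketch of how PSNR is computed; it is standard signal-processing arithmetic, not code from any of the papers summarized here:

```python
import numpy as np

def psnr(reference: np.ndarray, reconstruction: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two same-shaped arrays."""
    mse = np.mean((reference.astype(np.float64) - reconstruction.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical inputs: no noise at all
    return 10.0 * np.log10((max_val ** 2) / mse)

# A uniform error of 0.1 on a [0, 1] intensity scale gives MSE = 0.01,
# i.e. 10 * log10(1 / 0.01) = 20 dB.
ref = np.zeros((8, 8, 8))
rec = np.full((8, 8, 8), 0.1)
print(round(psnr(ref, rec), 2))  # → 20.0
```

Because the scale is logarithmic, the ~1 dB gains reported by methods like MAP-Diff below correspond to a roughly 20–25% reduction in mean squared error.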

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

ArtiFixer: Enhancing and Extending 3D Reconstruction with Auto-Regressive Diffusion Models

Researchers propose ArtiFixer, a two-stage pipeline using auto-regressive diffusion models to enhance 3D reconstruction quality. The method addresses scalability and quality issues in existing approaches by training a bidirectional generative model with opacity mixing, then distilling it into a causal auto-regressive model that generates hundreds of frames in a single pass.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10

MAP-Diff: Multi-Anchor Guided Diffusion for Progressive 3D Whole-Body Low-Dose PET Denoising

Researchers developed MAP-Diff, a multi-anchor guided diffusion framework that improves 3D whole-body PET scan denoising by using intermediate-dose scans as trajectory anchors. The method achieves significant improvements in image quality metrics, increasing PSNR from 42.48 dB to 43.71 dB while reducing radiation exposure for patients.

AI · Bullish · arXiv – CS AI · Mar 2 · 6/10

LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans

Researchers have developed LiteReality, a novel pipeline that converts RGB-D scans of indoor environments into compact, realistic 3D virtual replicas suitable for AR/VR, gaming, robotics, and digital twins. The system features scene understanding, object retrieval, material painting, and physics integration to create graphics-ready environments that support object individuality and physically-based rendering.

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10

AeroDGS: Physically Consistent Dynamic Gaussian Splatting for Single-Sequence Aerial 4D Reconstruction

Researchers have developed AeroDGS, a physics-guided 4D Gaussian splatting framework that enables accurate dynamic scene reconstruction from single-view aerial UAV footage. The system addresses key challenges in monocular aerial reconstruction by incorporating physics-based optimization and geometric constraints to resolve depth ambiguity and improve motion estimation.

AI · Bullish · arXiv – CS AI · Feb 27 · 6/10

Latent Gaussian Splatting for 4D Panoptic Occupancy Tracking

Researchers have developed LaGS (Latent Gaussian Splatting), a new AI method for 4D panoptic occupancy tracking that enables robots to better understand dynamic environments. The approach combines camera-based tracking with 3D occupancy prediction, achieving state-of-the-art performance on industry-standard datasets.

AI · Neutral · arXiv – CS AI · Apr 7 · 4/10

TreeGaussian: Tree-Guided Cascaded Contrastive Learning for Hierarchical Consistent 3D Gaussian Scene Segmentation and Understanding

TreeGaussian introduces a new framework for 3D scene understanding that uses tree-guided cascaded contrastive learning to better capture hierarchical semantic relationships in complex 3D environments. The method addresses limitations in existing 3D Gaussian Splatting approaches by implementing structured learning across object-part hierarchies and improving segmentation consistency.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

Field imaging framework for morphological characterization of aggregates with computer vision: Algorithms and applications

Researchers developed a comprehensive field imaging framework using computer vision and AI to automatically characterize construction aggregates like sand, gravel, and stone. The system uses 2D image analysis and 3D point cloud reconstruction with machine learning to replace manual inspection methods in construction material assessment.

AI · Neutral · arXiv – CS AI · Mar 5 · 4/10

Improving Multi-View Reconstruction via Texture-Guided Gaussian-Mesh Joint Optimization

Researchers propose a novel framework for 3D object reconstruction from multi-view images that simultaneously optimizes mesh geometry and appearance through Gaussian-guided rendering. The unified approach addresses limitations of existing methods that separate geometry and appearance optimization, enabling better downstream editing tasks like relighting and shape deformation.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10

CloDS: Visual-Only Unsupervised Cloth Dynamics Learning in Unknown Conditions

Researchers introduce CloDS (Cloth Dynamics Splatting), an unsupervised AI framework that learns cloth dynamics from visual observations without requiring known physical properties. The system uses a three-stage pipeline with dual-position opacity modulation to handle complex cloth deformations and self-occlusions through mesh-based Gaussian splatting.

AI · Bullish · arXiv – CS AI · Mar 3 · 4/10

PPC-MT: Parallel Point Cloud Completion with Mamba-Transformer Hybrid Architecture

Researchers propose PPC-MT, a hybrid Mamba-Transformer architecture for point cloud completion that uses parallel processing guided by Principal Component Analysis. The framework outperforms existing methods on benchmark datasets while maintaining computational efficiency by combining Mamba's linear complexity with Transformer's fine-grained modeling capabilities.
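As a generic illustration of the PCA guidance mentioned above (not the paper's actual pipeline): the principal axes of a point cloud fall out of a singular value decomposition of the centered coordinates, and can be used to order or partition points for parallel processing.

```python
import numpy as np

def principal_axes(points: np.ndarray) -> np.ndarray:
    """Return the principal axes of an (N, 3) point cloud, ordered by variance."""
    centered = points - points.mean(axis=0)
    # Rows of vt are the right singular vectors: the principal directions,
    # sorted by decreasing singular value (i.e. decreasing variance).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt

# A cloud strongly elongated along x: the dominant axis should be ~(±1, 0, 0).
rng = np.random.default_rng(0)
cloud = rng.normal(size=(500, 3)) * np.array([10.0, 1.0, 0.1])
axes = principal_axes(cloud)
print(np.abs(axes[0]).round(2))  # dominant axis ≈ [1, 0, 0]
```

Projecting points onto `axes[0]` gives a natural 1D ordering along the cloud's main extent, which is one plausible way such guidance could steer parallel completion branches.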

AI · Neutral · arXiv – CS AI · Mar 2 · 4/10

Uncertainty Matters in Dynamic Gaussian Splatting for Monocular 4D Reconstruction

Researchers introduce USplat4D, a new uncertainty-aware dynamic Gaussian Splatting framework that improves 3D scene reconstruction from monocular video by modeling per-Gaussian uncertainty. The approach addresses motion drift and poor synthesis quality by treating well-observed Gaussians as reliable anchors while handling poorly observed ones as less reliable.