AI · Bullish · arXiv – CS AI · Mar 5 · 6/10
🧠Researchers developed HPENets, a new suite of MLP networks for point cloud processing that uses High-dimensional Positional Encoding (HPE) and non-local MLPs. The approach delivers significant performance improvements while reducing computational costs by 50-80% compared to existing methods across multiple benchmark datasets.
AI · Bullish · arXiv – CS AI · Mar 3 · 7/10 · 3
🧠Researchers have developed TrajTrack, a new AI framework for 3D object tracking in LiDAR systems that achieves state-of-the-art performance while running at 55 FPS. The system improves tracking precision by 3.02% over existing methods by using historical trajectory data rather than computationally expensive multi-frame point cloud processing.
AI · Neutral · arXiv – CS AI · 3d ago · 5/10
🧠Researchers introduce xModel-KD, a cross-modal knowledge distillation framework that combines 2D image data with 3D LiDAR point clouds to improve 3D scene segmentation with fewer labeled examples. The method achieves a 2% absolute mIoU improvement over LiDAR-only approaches by using contrastive learning to exploit the complementary strengths of texture and geometric information.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers have developed an end-to-end deep learning model that reconstructs CAD (Computer-Aided Design) models from point cloud data by segmenting objects into individual extrusions. This approach improves the generalization and robustness of AI models for reverse engineering and quality control applications across manufacturing industries.
AI · Neutral · arXiv – CS AI · May 12 · 6/10
🧠Researchers propose HGC-Det, a hyperbolic geometry-based cross-modal distillation framework for 3D object detection that integrates point cloud and image data more effectively. The method addresses modality heterogeneity and spatial misalignment issues through three specialized components and demonstrates improved performance across indoor and outdoor datasets.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 5
🧠Researchers introduce SoPE (Spherical Coordinate-based Positional Embedding), a new method that enhances 3D Large Vision-Language Models by mapping point-cloud data into spherical coordinate space. This approach overcomes limitations of Rotary Position Embedding (RoPE) by better preserving spatial structures and directional variations in 3D multimodal understanding.
AI · Bullish · arXiv – CS AI · Feb 27 · 6/10 · 8
🧠Researchers introduce Fase3D, the first encoder-free 3D Large Multimodal Model that uses Fast Fourier Transform to process point cloud data efficiently. The model achieves comparable performance to encoder-based systems while being significantly more computationally efficient through novel tokenization and space-filling curve serialization.
AI · Neutral · arXiv – CS AI · Mar 17 · 4/10
🧠Researchers propose ConClu, an unsupervised pre-training framework for point clouds that combines contrasting and clustering techniques to learn discriminative representations without labeled data. The method outperforms state-of-the-art approaches on multiple downstream tasks, addressing the challenge of expensive point cloud annotation.
AI · Neutral · arXiv – CS AI · Mar 5 · 3/10
🧠Researchers developed a novel neural network architecture for classifying cuneiform tablet metadata using point-cloud representations. The convolution-inspired approach outperformed existing transformer-based methods like Point-BERT by gradually down-scaling point clouds while integrating local and global information.
AI · Bullish · arXiv – CS AI · Mar 3 · 4/10 · 5
🧠Researchers propose PPC-MT, a hybrid Mamba-Transformer architecture for point cloud completion that uses parallel processing guided by Principal Component Analysis. The framework outperforms existing methods on benchmark datasets while maintaining computational efficiency by combining Mamba's linear complexity with the Transformer's fine-grained modeling capabilities.