GaMi: Geometry-Agnostic Material Identification via Cross-Modal Subtractive Disentanglement
GaMi is a multimodal material identification system that combines mmWave and acoustic sensing to accurately identify materials regardless of geometric variations like shape, orientation, and distance. Using cross-modal subtractive disentanglement and contrastive learning, the system achieves 95.2% accuracy on 20 materials and demonstrates few-shot generalization across different devices.
GaMi addresses a fundamental challenge in embodied AI and robotics: reliable material identification without physical contact. Traditional single-sensor approaches struggle when geometric factors (object orientation, surface curvature, sensor distance) introduce noise into measurements. The key insight is that when two sensors, a mmWave radar and an acoustic sensor, observe the same object simultaneously, geometric variations affect both modalities in the same way, while material properties create a distinct signature in each. By isolating the shared geometric interference and subtracting it, GaMi extracts material features that remain consistent across viewing angles and distances.
This is a meaningful advance for robotic perception: machines can adapt their interaction strategy to material properties, which matters for tasks such as grasping fragile objects, applying appropriate force, or selecting suitable tools. The multimodal approach outperforms single-sensor baselines because the two modalities carry complementary information: acoustic sensing captures material resonance and damping, while mmWave detects dielectric properties. The few-shot adaptation mechanism addresses practical deployment concerns, allowing a model trained on one device to generalize quickly to slightly different hardware without extensive retraining.
Industrial applications span manufacturing quality control, robotic manipulation, and autonomous systems that require real-time material feedback. The 95.2% accuracy across 20 materials demonstrates practical utility, though scalability to hundreds of material variants remains unexplored. The work shows how multimodal sensor fusion and disentanglement techniques can overcome geometric confounds, a pattern that becomes more valuable as embodied AI systems need robust perception in unstructured environments.
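The subtraction idea described above can be illustrated with a toy linear model. This is only a sketch under strong assumptions and not GaMi's actual architecture: it assumes geometry enters both modalities as the same additive vector, it replaces any learned encoders with a plain feature difference, and every name in it (`sig_mmwave`, `sig_acoustic`, `observe`, `nearest_centroid`) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_materials, dim, n_samples = 5, 8, 200

# Hypothetical per-material signatures; each modality sees a different one.
sig_mmwave = rng.normal(size=(n_materials, dim))
sig_acoustic = rng.normal(size=(n_materials, dim))


def observe(labels, geom_scale=5.0, noise=0.05):
    """Simulate paired readings: material signature + shared geometry + sensor noise."""
    n = len(labels)
    # ASSUMPTION: the same geometric interference is added to both modalities.
    geometry = geom_scale * rng.normal(size=(n, dim))
    mm = sig_mmwave[labels] + geometry + noise * rng.normal(size=(n, dim))
    ac = sig_acoustic[labels] + geometry + noise * rng.normal(size=(n, dim))
    return mm, ac


def nearest_centroid(train_x, train_y, test_x):
    """Classify each test row by the closest per-class mean of the training rows."""
    centroids = np.stack(
        [train_x[train_y == k].mean(axis=0) for k in range(n_materials)]
    )
    dists = np.linalg.norm(test_x[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


train_y = rng.integers(0, n_materials, size=n_samples)
test_y = rng.integers(0, n_materials, size=n_samples)
train_mm, train_ac = observe(train_y)
test_mm, test_ac = observe(test_y)

# Single-modality baseline: geometry is still mixed into the features.
acc_single = (nearest_centroid(train_mm, train_y, test_mm) == test_y).mean()

# Subtractive features: the shared geometric term cancels in the difference.
acc_sub = (
    nearest_centroid(train_mm - train_ac, train_y, test_mm - test_ac) == test_y
).mean()

print(f"mmWave only: {acc_single:.2f}  cross-modal difference: {acc_sub:.2f}")
```

In this toy setting the difference `mm - ac` depends only on the material, because the shared geometric term cancels. Real sensors will not share geometry this cleanly, which is presumably why a learned disentanglement step is needed instead of a raw subtraction.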
- GaMi achieves 95.2% material identification accuracy on 20 materials by combining mmWave and acoustic sensing under varied geometric conditions.
- Cross-modal subtractive disentanglement isolates intrinsic material features by removing the geometric interference shared between the two sensors.
- Few-shot adaptation enables rapid generalization across different devices with minimal retraining.
- Multimodal fusion outperforms single-sensor baselines by leveraging complementary physical properties of materials.
- The system lets robots adapt their interaction strategies based on material identification, without requiring physical contact.