Prior Availability in Industrial Visual Sim-to-Real: A Review of CAD-Guided and CAD-Unavailable Regimes
This arXiv paper reviews industrial visual sim-to-real transfer in computer vision and proposes a taxonomy organized by the availability of CAD (Computer-Aided Design) data. It distinguishes three regimes: CAD-available settings, which use explicit geometry for rendering and verification; CAD-unavailable settings, which rely on appearance and feature priors; and hybrid approaches. Using benchmark datasets, it shows that raw synthetic data volume matters less than source-distribution design, detector capacity, and real-world calibration.
This academic review addresses a critical challenge in industrial automation: the domain gap between synthetic training data and real-world deployment. The paper reframes sim-to-real transfer not as a simple synthetic-to-real translation problem, but as a multifaceted mismatch involving sensors, lighting, materials, calibration variance, and rare failure modes. By organizing the landscape around prior availability, meaning the information (CAD models, reference images, or learned features) that grounds the system, the authors create a more nuanced framework than the traditional transfer learning literature offers.
The taxonomy bridges two historically separate research communities: 6D pose estimation and object detection literature that assumes full CAD availability, and industrial anomaly detection work that operates without geometric models. This connection is valuable because it reveals common principles across applications. The empirical validation on T-LESS/BOP, MVTec AD, and VisA benchmarks provides concrete evidence that raw synthetic render count does not guarantee transfer success. Instead, careful source-distribution design, appropriate detector architecture, and even modest real-world calibration data often outweigh massive synthetic datasets.
For industrial practitioners, this framing has immediate implications. CAD-available deployment creates distinct verification opportunities through geometric consistency checks (mask, pose, depth alignment), while CAD-unavailable inspection must rely on learned normality calibration and feature-space anomaly detection. The research challenges the assumption that a single leaderboard or metric should evaluate diverse deployment contexts. Organizations deciding between full geometric simulation pipelines and learned-feature approaches now have a principled framework for cost-benefit analysis, balancing data collection, model complexity, and deployment robustness.
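To make the contrast between the two verification styles concrete, here is a minimal sketch in plain Python. It is not taken from the paper: the function names, the `min_iou` threshold, the quantile-based calibration, and the nearest-neighbor scoring are all illustrative assumptions, chosen as simple stand-ins for a geometric consistency check (CAD-available) and for normality calibration in feature space (CAD-unavailable).

```python
from math import ceil, dist


def mask_iou(rendered, predicted):
    """Intersection-over-union of two binary masks (equal-sized nested lists of 0/1)."""
    inter = union = 0
    for row_r, row_p in zip(rendered, predicted):
        for r, p in zip(row_r, row_p):
            inter += 1 if (r and p) else 0
            union += 1 if (r or p) else 0
    # Two empty masks are treated as perfectly consistent.
    return inter / union if union else 1.0


def passes_geometric_check(rendered, predicted, min_iou=0.8):
    """CAD-available case: compare a mask rendered from the CAD model at the
    estimated pose with the detector's predicted mask.
    min_iou is a hypothetical threshold, not a value from the paper."""
    return mask_iou(rendered, predicted) >= min_iou


def anomaly_score(feature, normal_bank):
    """CAD-unavailable case: distance from a feature vector to its nearest
    neighbor among features of known-normal samples (an assumed scoring rule)."""
    return min(dist(feature, ref) for ref in normal_bank)


def calibrate_threshold(normal_bank, held_out_normals, quantile=0.95):
    """Pick the score threshold as an empirical quantile of the scores of
    held-out normal samples, i.e. a small real-world calibration set."""
    scores = sorted(anomaly_score(f, normal_bank) for f in held_out_normals)
    idx = max(0, min(len(scores) - 1, ceil(quantile * len(scores)) - 1))
    return scores[idx]


def is_anomalous(feature, normal_bank, threshold):
    """Flag a sample whose score exceeds the calibrated threshold."""
    return anomaly_score(feature, normal_bank) > threshold
```

The point of the sketch is structural: the first check needs explicit geometry to produce the rendered mask, while the second needs only a handful of real normal samples to set its threshold, which is why the two regimes call for different evaluation protocols.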
- CAD-unavailable settings require fundamentally different verification mechanisms than CAD-available deployments, warranting separate evaluation frameworks.
- Synthetic render volume alone predicts transfer performance poorly; source-distribution design and detector capacity matter more in closing domain gaps.
- Industrial sim-to-real should be reframed as a prior-availability problem rather than a simple synthetic-to-real classification task.
- Modest real-world calibration data often produces greater transfer gains than additional synthetic renderings without careful distribution alignment.
- The research bridges separate literatures on pose estimation and anomaly detection, revealing common principles across industrial visual deployment.