🧠 AI | ⚪ Neutral | Importance 5/10

Failure-Aware Refinement of Vision-Language Model for Lithography Defect Detection

arXiv – CS AI | Pangyun Jeong, Jiyeong Kong, Yuehua Hu, Dohee Jeong, Kyung-Tae Kang | June 9, 2026 at 04:00 AM

🤖AI Summary

Researchers propose a two-stage vision-language framework using Qwen3-VL with LoRA fine-tuning to detect semiconductor lithography defects, then employ a refinement module trained on first-stage failures to improve accuracy beyond standard single-stage approaches.

Analysis

This research targets pattern-defect detection in semiconductor manufacturing, where microscopic defects missed at inspection directly reduce production quality and yield. The proposed approach uses vision-language models (systems trained on both visual and textual data) to identify and classify lithography defects such as bridges, burrs, and contamination in inspection images. Its main novelty is the two-stage architecture: first-stage predictions are passed to a refinement stage that corrects them based on learned failure patterns.
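The two-stage flow described above can be sketched in a few lines of Python. This is an illustration only: `Prediction`, `two_stage_predict`, `toy_detector`, `toy_refiner`, and the `touches_neighbor` flag are hypothetical names, and the summary does not say whether the refinement stage runs on every prediction or only on uncertain ones (the sketch assumes every prediction).

```python
from typing import Callable, NamedTuple


class Prediction(NamedTuple):
    label: str         # e.g. "bridge", "burr", "contamination", or "none"
    confidence: float  # first-stage score in [0, 1]


def two_stage_predict(
    image: dict,
    detector: Callable[[dict], Prediction],
    refiner: Callable[[dict, Prediction], Prediction],
) -> Prediction:
    """Run the first-stage detector, then let the refiner revise its output."""
    first = detector(image)
    return refiner(image, first)


# Toy stand-ins for the fine-tuned model and the learned refinement module.
def toy_detector(image: dict) -> Prediction:
    return Prediction("burr", 0.55)


def toy_refiner(image: dict, first: Prediction) -> Prediction:
    # Hypothetical correction rule, used only to make the sketch runnable:
    # a "burr" that touches a neighboring line is relabeled as a "bridge".
    if first.label == "burr" and image.get("touches_neighbor"):
        return Prediction("bridge", first.confidence)
    return first


corrected = two_stage_predict({"touches_neighbor": True}, toy_detector, toy_refiner)
unchanged = two_stage_predict({"touches_neighbor": False}, toy_detector, toy_refiner)
print(corrected.label, unchanged.label)
```

In a real system both callables would wrap model inference; the point here is only the control flow, in which the second stage sees both the image and the first-stage output.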

Semiconductor inspection automation has grown increasingly important as feature sizes shrink and defect detection becomes more complex. Traditional computer vision methods struggle with subtle defects and edge cases. Vision-language models offer advantages by combining visual understanding with contextual reasoning, though they require substantial computational resources and careful training strategies. LoRA (Low-Rank Adaptation) keeps the pretrained weights frozen and trains only small low-rank update matrices, so the model can be customized efficiently without full retraining.
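To make the LoRA idea concrete, here is a minimal pure-Python sketch of the standard formulation, in which a frozen weight matrix W is augmented by a trainable low-rank product scaled by alpha / rank. The function names and layer sizes are illustrative assumptions, not values taken from the paper or from Qwen3-VL.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Trainable parameters for full fine-tuning vs. LoRA on one weight matrix."""
    full = d_out * d_in           # every entry of W is trainable
    lora = rank * (d_in + d_out)  # A is rank x d_in, B is d_out x rank
    return full, lora


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha: float, rank: int):
    """Compute h = W x + (alpha / rank) * B (A x), with W kept frozen."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


# Illustrative sizes only (assumed, not the actual model dimensions).
full, lora = lora_param_counts(d_out=4096, d_in=4096, rank=8)
print(full, lora)  # 16777216 65536

W = [[1, 2], [3, 4]]
A = [[1, 1]]                # rank 1
B_zero = [[0], [0]]         # LoRA starts with B = 0, so the output equals W x
B_trained = [[1], [0]]
x = [1, 1]
print(lora_forward(W, A, B_zero, x, alpha=2, rank=1))     # [3.0, 7.0]
print(lora_forward(W, A, B_trained, x, alpha=2, rank=1))  # [7.0, 7.0]
```

With these assumed sizes, the low-rank update has 256 times fewer trainable parameters than the full matrix, which is the source of the efficiency the summary refers to.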

The refinement stage is the key technical contribution: rather than accepting first-stage outputs as final, the system learns to correct common errors by training on the first stage's failure cases. This failure-aware approach mirrors human inspection workflows in which experienced technicians review questionable detections. For semiconductor manufacturers, better defect detection could mean fewer escaped defects reaching customers, lower warranty costs, and a stronger reputation in competitive markets.
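One plausible way to assemble training data for such a refinement stage is to compare first-stage predictions with ground-truth labels and keep only the disagreements. The sketch below is an assumption about how this could be done, not the authors' pipeline; the record layout, the `"none"` label for defect-free images, and all function names are hypothetical.

```python
from collections import Counter

NO_DEFECT = "none"  # assumed label for a defect-free image


def classify_failure(predicted: str, actual: str):
    """Return the failure type, or None when the first stage was correct."""
    if predicted == actual:
        return None
    if actual == NO_DEFECT:
        return "false_positive"    # defect reported where there is none
    if predicted == NO_DEFECT:
        return "missed_defect"     # real defect not reported
    return "misclassification"     # defect found, wrong class assigned


def collect_failures(records):
    """Build refinement examples from (image_id, predicted, actual) triples."""
    failures = []
    for image_id, predicted, actual in records:
        kind = classify_failure(predicted, actual)
        if kind is not None:
            failures.append({
                "image_id": image_id,
                "first_stage": predicted,  # input the refiner will see
                "target": actual,          # label the refiner should output
                "failure": kind,
            })
    return failures


# Made-up first-stage results, for illustration only.
records = [
    ("img_001", "bridge", "bridge"),
    ("img_002", "burr", "none"),
    ("img_003", "none", "contamination"),
    ("img_004", "burr", "bridge"),
]
failures = collect_failures(records)
print(Counter(f["failure"] for f in failures))
```

Keeping the first-stage label alongside the target lets a refinement model condition on the specific mistake it is expected to fix.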

The framework's practical deployment depends on validation across diverse lithography processes and fab environments. If that validation holds, semiconductor equipment suppliers and manufacturers integrating the technology could gain higher inspection accuracy and faster processing than purely manual or traditional algorithm-based systems.

Key Takeaways

→ Two-stage vision-language framework combines initial Qwen3-VL defect detection with a learned refinement module for improved accuracy
→ Failure-aware training on first-stage prediction errors enables the model to correct false positives, missed defects, and misclassifications
→ LoRA fine-tuning reduces computational overhead while customizing the model for semiconductor lithography applications
→ Semiconductor manufacturers could achieve cost savings through fewer escaped defects and faster automated inspection cycles
→ Framework addresses limitations of single-stage approaches by explicitly learning from error patterns rather than ignoring them

#vision-language-models #semiconductor-inspection #lithography-defects #machine-learning #qwen3-vl #fine-tuning #quality-control #manufacturing-automation

