
LASA: A Weak Supervision Method for Open-Vocabulary Scene Sketch Semantic Segmentation

arXiv – CS AI | Liwen Yi, Xianlin Zhang, Yue Zhang, Yue Ming, Xueming Li | June 11, 2026 at 04:00 AM

🤖AI Summary

Researchers introduce LASA, a weakly supervised method for open-vocabulary scene sketch semantic segmentation that aggregates attention maps from multiple Vision Transformer layers to capture complementary spatial cues. Without pixel-level annotations, the approach reports gains of 3.43 to 15.74 mIoU over weakly supervised baselines on three sketch datasets.

Analysis

LASA addresses a fundamental challenge in sketch understanding: sketches lack the texture and color information that typically guides semantic segmentation in natural images. The authors report that Vision Transformer layers hold hierarchically organized spatial information: shallow layers preserve global structural context, while deeper layers capture local details. By aggregating these complementary representations across layers, LASA obtains spatial cues that are more robust than those from any single layer.
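To make the idea of cross-layer aggregation concrete, here is a minimal sketch in plain Python. It is not the paper's aggregation rule, which this summary does not specify: the per-layer min-max normalization, the weighted average, and the names `min_max_normalize` and `aggregate_layers` are all assumptions chosen for illustration.

```python
from typing import List, Optional, Sequence

Grid = List[List[float]]


def min_max_normalize(grid: Sequence[Sequence[float]]) -> Grid:
    """Rescale one attention map to [0, 1]; a constant map becomes all zeros."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in grid]
    return [[(v - lo) / span for v in row] for row in grid]


def aggregate_layers(
    layer_maps: Sequence[Sequence[Sequence[float]]],
    weights: Optional[Sequence[float]] = None,
) -> Grid:
    """Fuse same-sized per-layer attention maps with a weighted average.

    Hypothetical fusion rule: each map is normalized on its own so that no
    layer dominates purely because of its value range.
    """
    if not layer_maps:
        raise ValueError("need at least one attention map")
    if weights is None:
        weights = [1.0] * len(layer_maps)
    if len(weights) != len(layer_maps):
        raise ValueError("one weight per layer is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    normalized = [min_max_normalize(m) for m in layer_maps]
    rows, cols = len(normalized[0]), len(normalized[0][0])
    for m in normalized:
        if len(m) != rows or any(len(row) != cols for row in m):
            raise ValueError("all attention maps must share one shape")

    fused = [[0.0] * cols for _ in range(rows)]
    for w, m in zip(weights, normalized):
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += (w / total) * m[r][c]
    return fused


# Toy 2x2 maps standing in for a shallow and a deep layer.
shallow = [[0.0, 2.0], [4.0, 8.0]]
deep = [[1.0, 1.0], [1.0, 3.0]]
fused = aggregate_layers([shallow, deep])
```

With equal weights, a location scores highly in the fused map only when several layers agree on it, which is the intuition behind combining complementary cues rather than trusting one layer.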

The method relies on weak supervision, so training does not require expensive pixel-level annotations. This is practical progress toward scalable computer vision systems, and it is particularly relevant for sketch-based interfaces and rapid annotation workflows. The core technical contribution, cross-layer attention aggregation, may also apply to other vision tasks where structural priors matter.

Experiments on three datasets (FS-COCO, SFSD, FrISS) show consistent gains of 3.43 to 15.74 mIoU over weakly supervised baselines. These results suggest that the approach generalizes across sketch domains and difficulty levels. The authors' commitment to an open-source release improves reproducibility and adoption potential within the computer vision community.
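For readers unfamiliar with the metric, the sketch below computes mean intersection over union (mIoU) on small label grids. It is an illustration only: the function name `mean_iou`, the optional `ignore_label`, and the choice to average over classes that appear in either the prediction or the ground truth are assumptions, and the benchmarks above may use a different protocol.

```python
from typing import Optional, Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
    ignore_label: Optional[int] = None,
) -> float:
    """Average per-class IoU over classes present in pred or target."""
    inter = [0] * num_classes
    union = [0] * num_classes
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if t == ignore_label:
                continue  # e.g. skip non-stroke pixels in a sketch
            if p == t:
                inter[t] += 1
                union[t] += 1
            else:
                # A mismatch enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1
    ious = [inter[c] / union[c] for c in range(num_classes) if union[c] > 0]
    if not ious:
        raise ValueError("no valid pixels to evaluate")
    return sum(ious) / len(ious)


pred = [[0, 0], [1, 1]]
target = [[0, 1], [1, 1]]
score = mean_iou(pred, target, num_classes=2)  # class 0: 1/2, class 1: 2/3
```

A reported gain such as 3.43 mIoU is normally read as the difference between two such scores expressed in percentage points.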

The research advances open-vocabulary segmentation, in which the set of category names is supplied at inference time and can change without retraining. This flexibility matters in deployment scenarios where semantic categories evolve. Although the work is primarily academic, it could support sketch-based interfaces in design tools, game development, and accessibility applications.
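The open-vocabulary idea can be illustrated with a small sketch: each image region is assigned the category whose text embedding is most similar to the region's embedding, so changing the vocabulary only means changing the text embeddings. The toy vectors below stand in for the outputs of a vision-language encoder, which this summary does not name; `cosine` and `classify_patches` are hypothetical helpers, not part of LASA.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_patches(
    patch_embeddings: Sequence[Sequence[float]],
    text_embeddings: Dict[str, Sequence[float]],
) -> List[str]:
    """Label each patch with the most similar category name."""
    names = list(text_embeddings)
    if not names:
        raise ValueError("vocabulary must not be empty")
    labels = []
    for patch in patch_embeddings:
        scores = [cosine(patch, text_embeddings[n]) for n in names]
        labels.append(names[scores.index(max(scores))])
    return labels


# Toy 2-D embeddings for two sketch regions (assumed, for illustration).
patches = [[1.0, 0.1], [0.1, 1.0]]

# The same patches can be labeled with two different vocabularies.
vocab_a = {"cat": [1.0, 0.0], "tree": [0.0, 1.0]}
vocab_b = {"dog": [0.9, 0.2], "house": [0.2, 0.9], "sun": [-1.0, -1.0]}
labels_a = classify_patches(patches, vocab_a)
labels_b = classify_patches(patches, vocab_b)
```

Nothing about the patch embeddings changes between the two calls, which is why swapping categories needs no retraining in this setup.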

Key Takeaways

→ LASA aggregates attention from multiple Vision Transformer layers to capture complementary spatial information for sketch segmentation, without pixel-level annotations
→ Cross-layer aggregation provides more robust structural priors than single-layer features, which is particularly important for texture-free sketches
→ Experiments show mIoU gains of 3.43 to 15.74 over weakly supervised baselines across three sketch segmentation benchmarks
→ The method supports open-vocabulary segmentation, so category vocabularies can change at inference time without retraining
→ The authors' commitment to releasing source code supports adoption and reproducibility within the computer vision research community

#computer-vision #semantic-segmentation #sketch-understanding #vision-transformers #weak-supervision #deep-learning #open-vocabulary

