Closed-Form Linear-Probe Dataset Distillation for Pre-trained Vision Models
Researchers introduce CLP-DD, a dataset distillation method tailored to frozen pre-trained vision models that exploits closed-form linear probing. The technique achieves performance comparable or superior to existing methods while running 14x faster and using 87.5% less GPU memory on ImageNet-1K.
Dataset distillation addresses a critical challenge in machine learning: compressing large training datasets into compact synthetic equivalents that preserve downstream utility. This research tackles a specific but increasingly common scenario in modern computer vision: transfer learning with frozen pre-trained encoders and lightweight linear probes. Rather than approximating solutions through neural-tangent-kernel assumptions or unrolling iterative update trajectories, CLP-DD exploits the mathematical property that linear probing admits an exact closed-form solution.
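To make the closed-form property concrete, the sketch below fits a ridge-regularized linear probe over frozen-encoder features in a single linear solve; the function name `fit_linear_probe` and the regularization strength `lam` are illustrative assumptions, not details from the paper.

```python
import torch

def fit_linear_probe(feats: torch.Tensor, targets: torch.Tensor,
                     lam: float = 1e-3) -> torch.Tensor:
    """Closed-form ridge-regression probe: W = (F^T F + lam*I)^{-1} F^T Y.

    feats:   (n, d) features from a frozen pre-trained encoder
    targets: (n, c) one-hot (or soft) class targets
    returns: (d, c) probe weights, obtained without iterative training
    """
    d = feats.shape[1]
    # Regularized Gram matrix in feature space.
    gram = feats.T @ feats + lam * torch.eye(d, device=feats.device)
    return torch.linalg.solve(gram, feats.T @ targets)
```

Because the probe is an exact solve rather than an optimization trajectory, differentiating a distillation objective through it requires no unrolling.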
The efficiency gains stem from eliminating the computational overhead of trajectory-based approaches and infinite-width approximations. By directly leveraging pre-trained feature geometry, the method reduces the bilevel optimization problem to kernel ridge regression in sample space, with synthetic image updates guided by a temperature-scaled softmax cross-entropy loss. Critically, the researchers demonstrate that the choice of outer objective dramatically influences performance: MSE-based losses underperform significantly compared to discriminative objectives that treat classifier columns as learned class anchors.
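A minimal sketch of how such an objective could be wired up, under assumed notation: the inner probe is solved in sample (dual) space, where the kernel matrix is only m x m for m synthetic images, and the outer loss scores real data with temperature-scaled softmax cross-entropy. The names `outer_loss`, `lam`, and `temp` are hypothetical, not the paper's API.

```python
import torch
import torch.nn.functional as F

def outer_loss(syn_feats, syn_targets, real_feats, real_labels,
               lam: float = 1e-3, temp: float = 0.1) -> torch.Tensor:
    """Differentiable outer objective for distillation (illustrative sketch).

    syn_feats:   (m, d) frozen-encoder features of the synthetic images
    syn_targets: (m, c) targets for the synthetic set
    real_feats:  (n, d) features of real training images
    real_labels: (n,)   integer class labels
    """
    m = syn_feats.shape[0]
    # Inner problem in sample space: alpha = (K + lam*I)^{-1} Y with K = S S^T,
    # equivalent to the primal ridge solution but only an (m, m) solve when m << d.
    K = syn_feats @ syn_feats.T
    alpha = torch.linalg.solve(K + lam * torch.eye(m, device=K.device), syn_targets)
    W = syn_feats.T @ alpha  # (d, c) closed-form probe; columns act as class anchors
    # Outer objective: temperature-scaled softmax cross-entropy on real data.
    logits = real_feats @ W / temp
    return F.cross_entropy(logits, real_labels)
```

Gradients flow through the linear solve back to the synthetic features, so the synthetic images can be updated directly against this discriminative loss rather than an MSE surrogate.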
These efficiency improvements carry practical significance for deployed systems. Running 14x faster with one-eighth the memory overhead enables dataset distillation at scale, benefiting edge deployment, federated learning, and resource-constrained environments. The method's parity with or superiority over prior approaches on ImageNet-1K benchmarks suggests broad applicability across vision architectures. For practitioners building efficient transfer learning pipelines, this meaningfully lowers the computational barrier to dataset compression techniques previously accessible only to well-resourced research teams.
- CLP-DD exploits closed-form linear-probe solutions to eliminate trajectory-unrolling overhead in dataset distillation.
- The method achieves 14x speedup and 87.5% memory reduction compared to existing state-of-the-art approaches.
- Outer objective design proves critical: discriminative losses substantially outperform standard MSE formulations.
- Performance matches or exceeds LGM with DSA on three of four tested backbones on ImageNet-1K.
- Practical efficiency gains enable dataset distillation deployment in resource-constrained transfer learning scenarios.