
#parameter-reduction News & Analysis

5 articles tagged with #parameter-reduction. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · May 12 · 7/10

LoopVLA: Learning Sufficiency in Recurrent Refinement for Vision-Language-Action Models

LoopVLA introduces a recurrent Vision-Language-Action model architecture that learns when to stop refining representations for robotic control tasks, achieving 45% parameter reduction and 1.7x faster inference while maintaining or improving task performance. The model uses self-supervised learning to estimate representation sufficiency rather than relying on predefined layer depths or heuristic rules.

AI · Neutral · arXiv – CS AI · Jun 9 · 6/10

Structured Neuron Pruning in Deep Neural Networks Using Multi-Armed Bandits

Researchers present a novel structured pruning framework that uses multi-armed bandit algorithms to remove redundant neurons from deep neural networks. The approach treats each neuron as a bandit arm, testing its importance through temporary masking and loss measurement, then applies various MAB policies (UCB1, Thompson Sampling, etc.) to identify which neurons to prune. Experiments across tabular and deep learning tasks show MAB-based pruning significantly outperforms traditional magnitude-based and greedy pruning methods.
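
The masking-and-scoring loop described above can be sketched with UCB1 on a toy problem. This is only an illustration under stated assumptions, not the paper's implementation: the "network" is a single linear layer whose unmasked loss is zero, the reward `1 / (1 + loss increase)` is a hypothetical choice that keeps rewards in (0, 1], and the names `masked_loss_increase` and `ucb1_prune` are invented for this sketch.

```python
import math
import random


def masked_loss_increase(weights, batch, idx):
    """Increase in mean squared error when neuron `idx` is zeroed out.

    Assumption: the unmasked toy model fits its targets exactly, so the
    increase equals the mean squared contribution of the masked neuron.
    """
    return sum((weights[idx] * x[idx]) ** 2 for x in batch) / len(batch)


def ucb1_prune(weights, data, n_prune, rounds=2000, batch_size=8, seed=0):
    """Return the indices of the `n_prune` neurons that look most redundant."""
    rng = random.Random(seed)
    n = len(weights)
    counts = [0] * n    # number of times each arm (neuron) was tested
    means = [0.0] * n   # running mean reward per arm

    for t in range(1, rounds + 1):
        if t <= n:
            arm = t - 1  # play every arm once before using the UCB rule
        else:
            # UCB1: mean reward plus an exploration bonus
            arm = max(
                range(n),
                key=lambda i: means[i] + math.sqrt(2 * math.log(t) / counts[i]),
            )
        batch = rng.sample(data, batch_size)
        # Hypothetical reward: close to 1 when masking barely changes the loss
        reward = 1.0 / (1.0 + masked_loss_increase(weights, batch, arm))
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]

    ranked = sorted(range(n), key=lambda i: means[i], reverse=True)
    return sorted(ranked[:n_prune])


# Toy data: neurons 1, 3 and 5 carry zero or near-zero weight.
demo_rng = random.Random(1)
toy_weights = [2.0, 0.0, 1.5, 0.01, -1.0, 0.0]
toy_data = [[demo_rng.gauss(0, 1) for _ in toy_weights] for _ in range(200)]
pruned = ucb1_prune(toy_weights, toy_data, n_prune=3)
```

In this toy run the three low-weight neurons end up with the highest mean reward and are selected for pruning. Thompson Sampling or another policy could replace the arm-selection line without changing the rest of the loop.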

AI · Bullish · arXiv – CS AI · May 9 · 6/10

Is One Layer Enough? Understanding Inference Dynamics in Tabular Foundation Models

Researchers conducted the first large-scale mechanistic study of tabular foundation models, revealing significant redundancy across inference layers. They demonstrated that a single-layer looped model can match the performance of state-of-the-art models while using only 20% of the parameters, challenging assumptions about depth requirements in transformer architectures.
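
To make the weight-sharing idea concrete, here is a minimal sketch of a looped layer next to a conventional stack. Everything in it is an assumption for illustration: the toy residual `block`, the depth of 5, and the function names are hypothetical and do not come from the paper, whose 20% figure refers to its own models.

```python
import math


def block(x, weight, bias):
    """One toy residual layer: x + tanh(W x + b)."""
    return [
        xi + math.tanh(sum(w * v for w, v in zip(row, x)) + b)
        for xi, row, b in zip(x, weight, bias)
    ]


def stacked_forward(x, layers):
    """Conventional depth: each layer has its own (weight, bias) pair."""
    for weight, bias in layers:
        x = block(x, weight, bias)
    return x


def looped_forward(x, weight, bias, n_loops):
    """Looped depth: the same (weight, bias) pair is applied n_loops times."""
    for _ in range(n_loops):
        x = block(x, weight, bias)
    return x


def count_params(layers):
    """Count the stored weights and biases across distinct layers."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


shared = ([[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1])
depth = 5  # hypothetical depth chosen for the illustration

looped_out = looped_forward([1.0, -1.0], shared[0], shared[1], depth)
stacked_out = stacked_forward([1.0, -1.0], [shared] * depth)
param_ratio = count_params([shared]) / (depth * count_params([shared]))
```

When every layer of the stack holds identical weights, the two forward passes compute the same thing, but the looped version stores the parameters only once. With a depth of 5, that is one fifth of the stacked count.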

AI · Bullish · arXiv – CS AI · Apr 7 · 6/10

Training Transformers in Cosine Coefficient Space

Researchers developed a new method to train transformer neural networks using discrete cosine transform (DCT) coefficients, achieving the same performance while using only 52% of the parameters. The technique requires no architectural changes and simply replaces standard linear layers with spectral layers that store DCT coefficients instead of full weight matrices.
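
As a rough illustration of storing DCT coefficients instead of a full weight matrix, the sketch below rebuilds a dense weight from a small block of low-frequency coefficients via an orthonormal DCT-II basis. The class name `SpectralLinear`, the choice to keep only the top-left coefficient block, and the 4x4 layer size are assumptions made here; the summary does not say how the paper selects or truncates coefficients.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis: row k is the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append(
            [scale * math.cos(math.pi * (j + 0.5) * k / n) for j in range(n)]
        )
    return rows


class SpectralLinear:
    """Hypothetical linear layer that stores a small block of DCT coefficients."""

    def __init__(self, coeffs, out_features, in_features):
        self.coeffs = coeffs  # the only trainable values in this sketch
        self.out_basis = dct_matrix(out_features)[: len(coeffs)]
        self.in_basis = dct_matrix(in_features)[: len(coeffs[0])]

    def weight(self):
        """Inverse 2-D DCT: W[i][j] = sum_ab out_basis[a][i] * c[a][b] * in_basis[b][j]."""
        n_out, n_in = len(self.out_basis[0]), len(self.in_basis[0])
        return [
            [
                sum(
                    self.out_basis[a][i] * self.coeffs[a][b] * self.in_basis[b][j]
                    for a in range(len(self.coeffs))
                    for b in range(len(self.coeffs[0]))
                )
                for j in range(n_in)
            ]
            for i in range(n_out)
        ]

    def forward(self, x):
        """Apply the reconstructed dense weight to an input vector."""
        return [sum(w * v for w, v in zip(row, x)) for row in self.weight()]

    def num_parameters(self):
        return len(self.coeffs) * len(self.coeffs[0])


# A 4x4 layer described by a 2x2 coefficient block (4 stored values, not 16).
layer = SpectralLinear([[4.0, 0.0], [0.0, 0.0]], out_features=4, in_features=4)
y = layer.forward([1.0, 2.0, 3.0, 4.0])
```

With only the constant (DC) coefficient set, the reconstructed weight is a constant matrix, so each output is a scaled sum of the inputs. A practical version would cache the basis matrices and use a fast transform rather than the explicit sums shown here.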

AI · Neutral · Hugging Face Blog · Jan 23 · 5/10 · 6

SmolVLM Grows Smaller – Introducing the 256M & 500M Models!

Hugging Face has released smaller variants of its SmolVLM vision-language model, with 256M and 500M parameters. Judging by the article title, these are more compact versions of the existing model, which could make the technology more accessible and efficient for a range of applications.