y0news

#preconditioning News & Analysis

2 articles tagged with #preconditioning. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Neutral · arXiv – CS AI · Jun 5 · 6/10

PC Layer: Polynomial Weight Preconditioning for Improving LLM Pre-Training

Researchers propose a PC (Preconditioning) layer that uses polynomial weight parameterization to stabilize training of large language models while maintaining computational efficiency. The approach demonstrates performance improvements over standard transformers during Llama-1B pre-training and includes theoretical guarantees for convergence in certain network architectures.

🧠 Llama
AI · Bullish · arXiv – CS AI · May 9 · 6/10

Pro-KLShampoo: Projected KL-Shampoo with Whitening Recovered by Orthogonalization

Researchers introduce Pro-KLShampoo, an improved optimizer for LLM pre-training that combines Kronecker-factored preconditioning with gradient orthogonalization. By exploiting the observed spike-and-flat eigenvalue structure in KL-Shampoo's preconditioners, Pro-KLShampoo achieves better validation loss, reduced memory usage, and faster training across multiple model scales.
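For readers unfamiliar with the term, gradient orthogonalization replaces a gradient matrix with its nearest orthogonal factor, so that all singular values become 1. The sketch below is illustrative only and is not taken from the Pro-KLShampoo paper: it assumes a square, full-rank gradient and uses the classic cubic Newton-Schulz iteration, and the helper names `matmul`, `transpose`, and `orthogonalize` are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def orthogonalize(g, steps=30):
    """Approximate the orthogonal polar factor of g (illustrative sketch).

    Assumes g is square and full rank. Dividing by the Frobenius norm puts
    every singular value in (0, 1], where the cubic Newton-Schulz update
    X <- 1.5 * X - 0.5 * X X^T X drives each singular value toward 1.
    """
    fro = sum(v * v for row in g for v in row) ** 0.5
    x = [[v / fro for v in row] for row in g]
    for _ in range(steps):
        xxtx = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * p - 0.5 * q for p, q in zip(row_x, row_c)]
             for row_x, row_c in zip(x, xxtx)]
    return x


# Toy 2x2 "gradient"; the values are made up for illustration.
grad = [[3.0, 1.0], [0.0, 2.0]]
q = orthogonalize(grad)
gram = matmul(q, transpose(q))  # should be close to the identity matrix
```

The result keeps the singular directions of the gradient while discarding its singular values, which is the same effect an exact whitening preconditioner would have on a single gradient matrix.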