y0news
#low-rank · 1 article
AI · Bullish · arXiv – CS AI · 4h ago

Taming Momentum: Rethinking Optimizer States Through Low-Rank Approximation

Researchers introduce LoRA-Pre, an optimizer that cuts the memory overhead of training large language models by storing momentum states as low-rank approximations. The method outperforms baseline memory-efficient optimizers on Llama models from 60M to 1B parameters while using only 1/8 of their rank.
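The summary does not spell out LoRA-Pre's actual update rule, but the core idea it names, keeping momentum in a low-rank subspace instead of at full size, can be sketched in a few lines. The sketch below assumes a GaLore-style fixed orthonormal projection; the class name `LowRankMomentumSGD`, the projection scheme, and all hyperparameters are illustrative assumptions, not the paper's method.

```python
# Minimal sketch of low-rank momentum, assuming a fixed random orthonormal
# projection (GaLore-style). Everything here is illustrative, not LoRA-Pre's
# published algorithm.
import numpy as np

class LowRankMomentumSGD:
    """SGD with momentum kept in a rank-r subspace.

    For an (m, n) weight matrix, a full momentum buffer is also (m, n);
    here we store only an (r, n) buffer plus an (r, m) projection.
    """

    def __init__(self, shape, rank, lr=1e-3, beta=0.9, seed=0):
        m, n = shape
        rng = np.random.default_rng(seed)
        # Hypothetical choice: a fixed random orthonormal projection.
        q, _ = np.linalg.qr(rng.standard_normal((m, rank)))
        self.proj = q.T                      # (r, m)
        self.momentum = np.zeros((rank, n))  # (r, n) instead of (m, n)
        self.lr, self.beta = lr, beta

    def step(self, weights, grad):
        # Accumulate momentum inside the low-rank subspace...
        self.momentum = self.beta * self.momentum + self.proj @ grad
        # ...then map the update back to the full parameter space.
        return weights - self.lr * (self.proj.T @ self.momentum)


# Usage: a 512x512 weight with rank-64 momentum state.
W = np.zeros((512, 512))
opt = LowRankMomentumSGD(W.shape, rank=64)
grad = np.random.default_rng(1).standard_normal(W.shape)
W = opt.step(W, grad)
```

The memory argument is visible in the shapes: full momentum costs m·n floats, while this sketch stores r·(m + n) for the buffer and projection combined, so with r much smaller than m the optimizer state shrinks substantially, which is the spirit of the 1/8-rank claim in the summary.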