y0news
#variance-concentration · 1 article
AI · Bullish · arXiv – CS AI · 7h ago · 7/10

Residual Stream Analysis of Overfitting And Structural Disruptions

Researchers identified that repetitive safety training data causes large language models to develop false refusals, where benign queries are incorrectly declined. They developed FlowLens, a PCA-based analysis tool, and proposed Variance Concentration Loss (VCL) as a regularization technique that reduces false refusals by over 35 percentage points while maintaining performance.
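
To make the idea of variance concentration concrete, here is a minimal Python sketch that measures how much of the variance in a batch of activation vectors falls on the top principal components. It is an illustration only: the summary does not give the definitions of FlowLens or VCL, so the function name `variance_concentration`, the top-k ratio, and the synthetic data are all assumptions and not the authors' method.

```python
import numpy as np


def variance_concentration(acts: np.ndarray, k: int = 1) -> float:
    """Fraction of total variance captured by the top-k principal components.

    Hypothetical helper for illustration; not the paper's VCL definition.
    acts has shape (n_samples, hidden_dim).
    """
    # Center the activations so the SVD corresponds to PCA.
    centered = acts - acts.mean(axis=0, keepdims=True)
    # Singular values come back sorted in descending order; their squares
    # are proportional to the variance along each principal component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    total = variances.sum()
    if total == 0.0:
        return 0.0
    return float(variances[:k].sum() / total)


# Synthetic stand-ins for residual stream activations (assumed data).
rng = np.random.default_rng(0)
isotropic = rng.normal(size=(200, 16))
direction = rng.normal(size=(1, 16))
coefficients = 5.0 * rng.normal(size=(200, 1))
# Almost all variance lies along one direction, plus a little noise.
collapsed = coefficients @ direction + 0.1 * rng.normal(size=(200, 16))

iso_score = variance_concentration(isotropic)
col_score = variance_concentration(collapsed)
```

In this toy setup `col_score` comes out much higher than `iso_score`, because the second matrix varies almost entirely along a single direction. A regularizer in the spirit the summary describes might penalize or reward such a ratio during fine-tuning, but the exact form used by the researchers is not stated here.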