y0news
#activation-analysis · 1 article
AI · Neutral · arXiv – CS AI · 5h ago

Narrow Finetuning Leaves Clearly Readable Traces in Activation Differences

Researchers found that narrowly finetuning a large language model leaves clearly readable traces in its activations: the differences between the base and finetuned models' activations reveal information about the finetuning domain, including what kind of data was used. To reduce these traces, the authors suggest mixing pretraining data into the finetuning corpus.
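The core idea, comparing base-model and finetuned-model activations on the same inputs, can be illustrated with a minimal sketch. This is not the paper's method; it simulates activations with NumPy arrays, where narrow finetuning is modeled as a consistent, domain-specific shift in one layer, and shows how averaging activation differences over prompts makes that shift stand out.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated residual-stream activations for N prompts, L layers, D dims.
# In practice these would come from running the same neutral prompts
# through both the base and the finetuned model.
N, L, D = 32, 4, 64
base_acts = rng.normal(size=(N, L, D))

# Model narrow finetuning as a consistent, domain-specific shift
# concentrated in one layer (layer 2 here), plus small per-prompt noise.
bias = np.zeros((L, D))
bias[2] = 0.5 * rng.normal(size=D)
tuned_acts = base_acts + bias + rng.normal(scale=0.05, size=(N, L, D))

# Mean activation difference per layer, averaged over prompts.
# Prompt-specific variation averages out; the finetuning bias does not.
mean_diff = (tuned_acts - base_acts).mean(axis=0)    # shape (L, D)
per_layer_norm = np.linalg.norm(mean_diff, axis=1)   # shape (L,)

print(per_layer_norm.argmax())  # → 2, the layer carrying the trace
```

Averaging over many prompts is what makes the trace "clearly readable": idiosyncratic activation noise cancels, while the systematic shift introduced by narrow finetuning survives.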