AI · Neutral · arXiv – CS AI · 6h ago · 6/10
Subliminal Learning is a LoRA Artifact
Researchers demonstrate that subliminal learning, in which language models transmit behavioral traits through seemingly neutral data, is a fragile artifact of LoRA fine-tuning rather than a genuine learning phenomenon. The transmission effect disappears under full fine-tuning and depends heavily on specific context being present during both training and evaluation, which suggests it is an unstable channel for behavioral transfer.