AI · Bearish · arXiv (CS AI) · 4h ago · 6/10
What Is The Political Content in LLMs' Pre- and Post-Training Data?
The study reports that large language models exhibit political biases stemming from systematically left-leaning training data, and that pre-training datasets contain more politically engaged content than post-training data. It finds strong correlations between the political stances in the training data and model behavior, with the biases persisting across all training stages.