AI · Bullish · arXiv – CS AI · 7h ago · 7/10

WRAP++: Web discoveRy Amplified Pretraining

WRAP++ is a new pretraining technique that enhances language model training by discovering cross-document relationships through web hyperlinks and synthesizing multi-document question-answer pairs. By amplifying ~8.4B tokens into 80B tokens of relational QA data, the method enables models like OLMo to achieve significant performance improvements on factual retrieval tasks compared to single-document approaches.
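The paper's code is not shown here, but the pipeline as summarized (pair documents connected by web hyperlinks, then prompt a teacher model to synthesize QA pairs spanning both) can be sketched as below. All names (`Doc`, `linked_pairs`, `qa_prompt`) and the prompt wording are illustrative assumptions, not the authors' implementation; the actual teacher-model call is stubbed out as a prompt string.

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    url: str
    text: str
    links: list = field(default_factory=list)  # outgoing hyperlink URLs


def linked_pairs(docs):
    """Pair each document with every document it hyperlinks to.

    This is the cross-document discovery step: the hyperlink graph
    supplies candidate relational pairs without any similarity search.
    """
    by_url = {d.url: d for d in docs}
    pairs = []
    for d in docs:
        for target in d.links:
            if target in by_url and target != d.url:
                pairs.append((d, by_url[target]))
    return pairs


def qa_prompt(src, dst):
    """Format a two-document prompt for a teacher LM to synthesize
    QA pairs that require facts from both documents (the amplification
    step that turns raw tokens into relational QA data)."""
    return (
        "Write question-answer pairs that require facts from "
        "BOTH documents.\n\n"
        f"Document A ({src.url}):\n{src.text}\n\n"
        f"Document B ({dst.url}):\n{dst.text}\n"
    )


# Toy corpus: document A links to document B.
docs = [
    Doc("a.example", "Marie Curie won two Nobel Prizes.", ["b.example"]),
    Doc("b.example", "The Nobel Prize is awarded in Stockholm.", []),
]

for src, dst in linked_pairs(docs):
    print(qa_prompt(src, dst))
```

Each generated prompt would be sent to a teacher model, and its QA outputs pooled into the synthetic pretraining corpus; repeating this over many linked pairs is what amplifies the token count.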