arXiv, CS AI
Summer-22B: A Systematic Approach to Dataset Engineering and Training at Scale for Video Foundation Model
Researchers documented their experience training Summer-22B, a video foundation model developed from scratch using 50 million clips. The report details engineering challenges, dataset curation methods, and architectural decisions, emphasizing that dataset engineering consumed the majority of development effort.