y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#dataset-curation News & Analysis

4 articles tagged with #dataset-curation. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

4 articles
AIBearisharXiv – CS AI · May 47/10
🧠

The Algorithmic Gaze of Image Quality Assessment: An Audit and Trace Ethnography of the LAION-Aesthetics Predictor

Researchers audited LAION-Aesthetics Predictor (LAP), an algorithmic model widely used to filter training datasets for visual generative AI systems like Stable Diffusion. The audit reveals LAP systematically biases toward images of women while filtering out men and LGBTQ+ individuals, and reinforces Western artistic preferences, raising critical questions about whose aesthetic values shape AI-generated imagery.

🧠 Stable Diffusion
AIBullisharXiv – CS AI · Mar 47/103
🧠

Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy

Researchers introduce Skywork-Reward-V2, a suite of AI reward models trained on SynPref-40M, a massive 40-million preference pair dataset created through human-AI collaboration. The models achieve state-of-the-art performance across seven major benchmarks by combining human annotation quality with AI scalability for better preference learning.

AINeutralarXiv – CS AI · Mar 36/107
🧠

Challenges in Enabling Private Data Valuation

Researchers identify fundamental conflicts between data privacy and data valuation methods used in AI training. The study shows that differential privacy requirements often destroy the fine-grained distinctions needed for effective data valuation, particularly for rare or influential examples.

AIBullishOpenAI News · Jun 106/105
🧠

Improving language model behavior by training on a curated dataset

Researchers have discovered that language model behavior can be improved for specific behavioral values through fine-tuning on small, curated datasets. This approach offers a more efficient method for aligning AI models with desired behavioral outcomes without requiring massive training resources.