y0news
AnalyticsDigestsSourcesTopicsRSSAICrypto

#synthetic-datasets News & Analysis

3 articles tagged with #synthetic-datasets. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

3 articles
AIBullisharXiv โ€“ CS AI ยท Apr 137/10
๐Ÿง 

PhysInOne: Visual Physics Learning and Reasoning in One Suite

PhysInOne is a large-scale synthetic dataset containing 2 million videos across 153,810 dynamic 3D scenes designed to address the scarcity of physics-grounded training data for AI systems. The dataset covers 71 physical phenomena and includes comprehensive annotations, demonstrating significant improvements in physics-aware video generation, prediction, and property estimation when used to fine-tune foundation models.

AIBullisharXiv โ€“ CS AI ยท Mar 66/10
๐Ÿง 

Enhancing Zero-shot Commonsense Reasoning by Integrating Visual Knowledge via Machine Imagination

Researchers propose 'Imagine,' a new zero-shot commonsense reasoning framework that enhances Pre-trained Language Models by integrating machine-generated visual signals into the reasoning pipeline. The approach demonstrates superior performance over existing zero-shot methods and even advanced large language models by addressing human reporting biases through machine imagination.

AINeutralarXiv โ€“ CS AI ยท Mar 264/10
๐Ÿง 

Konkani LLM: Multi-Script Instruction Tuning and Evaluation for a Low-Resource Indian Language

Researchers developed Konkani LLM, a specialized language model for the low-resource Indian language Konkani, using a synthetic 100k instruction dataset. The model addresses training data scarcity across multiple scripts (Devanagari, Romi, Kannada) and demonstrates competitive performance against proprietary models in machine translation tasks.

๐Ÿง  Gemini๐Ÿง  Llama