y0news

#data-augmentation News & Analysis

8 articles tagged with #data-augmentation. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Mar 11 · 6/10

Grounding Synthetic Data Generation With Vision and Language Models

Researchers introduce ARAS400k, a large-scale remote sensing dataset containing 400k images (100k real, 300k synthetic) with segmentation maps and descriptions. The study demonstrates that combining real and synthetic data consistently outperforms training on real data alone for semantic segmentation and image captioning tasks.

AI · Bearish · arXiv – CS AI · Mar 3 · 6/10 · 6

LangGap: Diagnosing and Closing the Language Gap in Vision-Language-Action Models

Researchers reveal that state-of-the-art Vision-Language-Action (VLA) models largely ignore language instructions despite achieving 95% success on standard benchmarks. The new LangGap benchmark exposes significant language understanding deficits, with targeted data augmentation only partially addressing the fundamental challenge of diverse instruction comprehension.

AI · Bullish · arXiv – CS AI · Mar 3 · 6/10 · 4

Augmenting Research Ideation with Data: An Empirical Investigation in Social Science

Researchers developed a framework that improves AI-generated research ideas by incorporating relevant data during the ideation process. The approach increased idea feasibility by 20% and overall quality by 7%, with human studies confirming that data-augmented AI assistance helps researchers generate higher-quality ideas.

AI · Neutral · arXiv – CS AI · Mar 2 · 6/10 · 19

BRIDGE the Gap: Mitigating Bias Amplification in Automated Scoring of English Language Learners via Inter-group Data Augmentation

Researchers developed BRIDGE, a framework to reduce bias in AI-powered automated scoring systems that unfairly penalize English Language Learners (ELLs). The system addresses representation bias by generating synthetic high-scoring ELL samples, achieving fairness improvements comparable to using additional human data while maintaining overall performance.

AI · Neutral · arXiv – CS AI · Apr 6 · 5/10

Generating Satellite Imagery Data for Wildfire Detection through Mask-Conditioned Generative AI

Researchers developed a generative AI approach using EarthSynth to create synthetic post-wildfire satellite imagery for training deep learning wildfire detection systems. The study found that inpainting-based pipelines significantly outperformed full-tile generation, achieving better spatial alignment and burn area detection accuracy.

AI · Neutral · arXiv – CS AI · Mar 3 · 4/10 · 4

Data-Augmented Deep Learning for Downhole Depth Sensing and Validation

Researchers developed a data-augmented deep learning system for accurate downhole depth sensing in oil and gas wells using casing collar locator (CCL) technology. The system addresses the scarcity of real well data through comprehensive preprocessing methods, improving the F1 score of collar recognition models by up to 0.057.

AI · Neutral · arXiv – CS AI · Feb 27 · 4/10 · 3

TabDLM: Free-Form Tabular Data Generation via Joint Numerical-Language Diffusion

Researchers introduce TabDLM, a new AI framework that generates synthetic tabular data containing both numerical values and free-form text using joint numerical-language diffusion models. The approach addresses limitations of existing diffusion and LLM-based methods by combining masked diffusion for text with continuous diffusion for numbers, enabling better synthetic data generation for privacy and data augmentation applications.