y0news
#synthetic-data
6 articles
AI · Bullish · arXiv – CS AI · 4h ago · 5

SALIENT: Frequency-Aware Paired Diffusion for Controllable Long-Tail CT Detection

Researchers introduce SALIENT, a frequency-aware diffusion framework that improves detection of rare lesions in CT scans by generating synthetic training data in the wavelet domain rather than in pixel space. The approach addresses extreme class imbalance in medical imaging through controllable augmentation, and it reportedly yields significant gains in detection performance for low-prevalence conditions.

AI · Neutral · arXiv – CS AI · 4h ago · 8

BRIDGE the Gap: Mitigating Bias Amplification in Automated Scoring of English Language Learners via Inter-group Data Augmentation

Researchers developed BRIDGE, a framework for reducing bias in AI-powered automated scoring systems that unfairly penalize English Language Learners (ELLs). It addresses representation bias by generating synthetic high-scoring ELL samples, achieving fairness improvements comparable to those from adding human-scored data while maintaining overall performance.

AI · Bullish · arXiv – CS AI · 4h ago · 6

LLM-Driven Multi-Turn Task-Oriented Dialogue Synthesis for Realistic Reasoning

Researchers propose an LLM-driven framework for generating multi-turn task-oriented dialogues to create more realistic reasoning benchmarks. The framework addresses limitations in current AI evaluation methods by producing synthetic datasets that better reflect real-world complexity and contextual coherence.

AI · Bullish · arXiv – CS AI · 4h ago · 4

TradeFM: A Generative Foundation Model for Trade-flow and Market Microstructure

Researchers introduced TradeFM, a 524M-parameter generative model trained on billions of trade events across more than 9,000 equities to capture market microstructure. The model can generate synthetic market data and generalizes across markets without asset-specific calibration, potentially enabling new applications in trading and market simulation.

AI · Bullish · arXiv – CS AI · 4h ago · 1

ProductResearch: Training E-Commerce Deep Research Agents via Multi-Agent Synthetic Trajectory Distillation

Researchers developed ProductResearch, a multi-agent AI framework that creates synthetic training data to improve e-commerce shopping agents. Multiple AI agents generate comprehensive product research trajectories, and in experiments a compact model fine-tuned on this synthetic data significantly outperformed base models on shopping assistance tasks.

AI · Neutral · arXiv – CS AI · 4h ago · 1

Modelling and Simulation of Neuromorphic Datasets for Anomaly Detection in Computer Vision

Researchers introduce ANTShapes, a Unity-based simulation framework that generates synthetic neuromorphic vision datasets to address the scarcity of Dynamic Vision Sensor (DVS) data. The tool creates configurable 3D scenes with randomly behaving objects for training anomaly detection and object recognition systems in event-based computer vision.