#pipeline-optimization News & Analysis

3 articles tagged with #pipeline-optimization. AI-curated summaries with sentiment analysis and key takeaways from 50+ sources.

AI · Bullish · arXiv – CS AI · Jun 8 · 7/10

DataEvolver: Automatic Data Preparation for Large Language Models through Multi-Level Self-Evolving

DataEvolver is a new self-evolving system that automatically prepares raw data for large language model training by constructing and refining data processing pipelines. The system achieves approximately 10% performance gains on downstream LLM tasks compared to using unprocessed data, reducing the need for expensive manual data curation.

AI · Bullish · arXiv – CS AI · May 28 · 7/10

Mitigating Staleness in Asynchronous Pipeline Parallelism via Basis Rotation

Researchers propose a basis rotation framework to address gradient staleness in asynchronous pipeline parallelism, a technique used for distributed AI training. By aligning the optimizer's coordinate system with the Hessian eigenbasis, the method reduces training iterations by 81.7% compared to existing asynchronous baselines, enabling more efficient large-scale model training.
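
As a rough intuition for why rotating into the Hessian eigenbasis can help a coordinate-wise optimizer, the toy sketch below compares one per-coordinate scaled step on a 2x2 quadratic loss, taken in the original basis and in the eigenbasis. This is not the paper's algorithm and it does not model staleness or pipeline stages; the matrix `H`, the helper names, and the starting point are all assumptions chosen for illustration.

```python
import math

# Toy quadratic loss f(x) = 0.5 * x^T H x with a non-diagonal Hessian.
# H is an illustrative assumption, not taken from the paper.
H = [[3.0, 1.0], [1.0, 3.0]]

# Eigen-decomposition of H, written out by hand for this 2x2 case:
# eigenvalues 4 and 2, eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
S = 1.0 / math.sqrt(2.0)
Q = [[S, S], [S, -S]]  # columns are the eigenvectors
EIGVALS = [4.0, 2.0]


def matvec(m, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def transpose(m):
    """Transpose a 2x2 matrix."""
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]


def diagonal_step(x):
    """One step scaled per coordinate by diag(H), in the original basis."""
    g = matvec(H, x)
    return [x[i] - g[i] / H[i][i] for i in range(2)]


def rotated_step(x):
    """The same kind of per-coordinate step, taken in the eigenbasis of H."""
    y = matvec(transpose(Q), x)                    # rotate into the eigenbasis
    g_y = [EIGVALS[i] * y[i] for i in range(2)]    # gradient is diagonal there
    y_new = [y[i] - g_y[i] / EIGVALS[i] for i in range(2)]
    return matvec(Q, y_new)                        # rotate back


def norm(v):
    """Euclidean distance from the minimizer at the origin."""
    return math.sqrt(v[0] ** 2 + v[1] ** 2)


x0 = [1.0, 2.0]
plain = diagonal_step(x0)
rotated = rotated_step(x0)
print("distance to optimum, original basis:", norm(plain))
print("distance to optimum, eigenbasis:    ", norm(rotated))
```

In this toy case the per-coordinate step lands on the minimizer only when taken in the eigenbasis, because the curvature is decoupled across coordinates there. Whether and how this carries over to stale gradients in asynchronous pipelines is the subject of the paper itself.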

AI · Bullish · arXiv – CS AI · Apr 10 · 6/10

In-Context Decision Making for Optimizing Complex AutoML Pipelines

Researchers propose PS-PFN, an AutoML method that extends traditional algorithm selection and hyperparameter optimization to modern ML pipelines that include fine-tuning and ensembling. Using posterior sampling and prior-data fitted networks for in-context learning, the approach outperforms existing bandit and AutoML strategies on benchmark tasks.
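
For readers unfamiliar with posterior sampling, the sketch below shows its classic bandit form (Thompson sampling with Beta posteriors over Bernoulli rewards). It is a stand-in, not PS-PFN: the paper uses prior-data fitted networks rather than Beta posteriors. The arms can be read loosely as candidate pipelines, and the function name, success rates, and round count are assumptions made up for illustration.

```python
import random


def thompson_sampling(success_rates, rounds, seed=0):
    """Beta-Bernoulli Thompson sampling; returns how often each arm was pulled.

    success_rates are hypothetical hidden success probabilities, one per arm.
    """
    rng = random.Random(seed)
    n = len(success_rates)
    alpha = [1] * n   # Beta(1, 1) uniform prior: 1 + observed successes
    beta = [1] * n    # 1 + observed failures
    pulls = [0] * n

    for _ in range(rounds):
        # Draw one plausible success rate per arm from its current posterior.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(n)]
        # Act greedily with respect to the sampled values.
        arm = max(range(n), key=lambda i: samples[i])
        # Observe a simulated Bernoulli reward and update that arm's posterior.
        reward = 1 if rng.random() < success_rates[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


pulls = thompson_sampling([0.2, 0.5, 0.8], rounds=2000)
print("pulls per arm:", pulls)
```

Sampling from the posterior, rather than always picking the arm with the best current estimate, is what lets the procedure keep exploring uncertain options while concentrating most of the budget on the option that looks best.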