AI · Neutral · arXiv – CS AI · 11h ago · 6/10
EPSVec: Efficient and Private Synthetic Data Generation via Dataset Vectors
Researchers introduce EPSVec, a differentially private method for generating synthetic data with large language models that is significantly more efficient than existing approaches. By using dataset vectors to steer LLM generation, the technique decouples the privacy cost from the number of synthetic samples generated, enabling high-quality synthetic data creation even with limited private datasets.
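To see why the privacy cost need not grow with the number of samples, here is a minimal sketch of the general idea. It is not the paper's algorithm. It assumes each private record has already been mapped to a fixed-length embedding, privatizes the mean of those embeddings once with the standard Gaussian mechanism, and then reuses the noisy vector for any number of steering steps. Reusing it is post-processing under differential privacy, so it consumes no further budget. The names `clip`, `private_dataset_vector`, and `steer`, and the additive steering rule, are hypothetical and chosen for illustration only.

```python
import math
import random


def clip(vec, max_norm):
    """Scale vec down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm <= max_norm or norm == 0.0:
        return list(vec)
    scale = max_norm / norm
    return [x * scale for x in vec]


def private_dataset_vector(embeddings, max_norm, epsilon, delta, rng):
    """Noisy mean of clipped per-record embeddings (Gaussian mechanism).

    Assumption: this stands in for a "dataset vector"; the paper may
    construct its vectors differently.
    """
    n = len(embeddings)
    dim = len(embeddings[0])
    clipped = [clip(e, max_norm) for e in embeddings]
    mean = [sum(e[i] for e in clipped) / n for i in range(dim)]
    # Replacing one record moves the mean by at most 2 * max_norm / n (L2).
    sensitivity = 2.0 * max_norm / n
    # Classic Gaussian-mechanism calibration, valid for 0 < epsilon < 1.
    sigma = sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon
    noisy = [m + rng.gauss(0.0, sigma) for m in mean]
    return noisy, sigma


def steer(hidden_state, dataset_vector, strength):
    """Hypothetical steering rule: shift a hidden state along the vector."""
    return [h + strength * v for h, v in zip(hidden_state, dataset_vector)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy "private" embeddings; real ones would come from a model.
    private = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(200)]

    # The private data is touched exactly once, at cost (epsilon, delta).
    vector, sigma = private_dataset_vector(private, 1.0, 0.5, 1e-5, rng)

    # Any number of generations can reuse the same vector at no extra cost.
    steered = [steer([0.0] * 4, vector, strength=2.0) for _ in range(1000)]
    print(f"noise scale: {sigma:.4f}, steered states: {len(steered)}")
```

In this sketch the noise scale shrinks as the number of private records grows, and it does not depend on how many synthetic samples are drawn afterwards.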