AI · Neutral · arXiv – CS AI · 10h ago · 4/10
Improving Text-to-Music Generation with Human Preference Rewards
Researchers submitted an entry to an academic text-to-music generation challenge that uses TuneJury, a learned human-preference reward system, to improve model outputs. The approach applies five engineering optimizations to a 120M-parameter FluxAudio-S backbone: reward conditioning, architectural sweeps, expert iteration, preference tuning, and inference post-processing.