AI · Bullish · arXiv CS AI · 5h ago
Harmonizing Dense and Sparse Signals in Multi-turn RL: Dual-Horizon Credit Assignment for Industrial Sales Agents
Researchers propose Dual-Horizon Credit Assignment (DuCA), a framework for optimizing large language models in industrial sales applications. The method addresses training instability by normalizing short-term linguistic rewards and long-term commercial rewards separately, achieving a 6.82% improvement in conversion rate while reducing repetition and detection issues.
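To illustrate what normalizing the two reward streams separately could look like, here is a minimal Python sketch. It is not the authors' implementation: the function names (`zscore`, `dual_horizon_advantages`), the z-score normalization, the batch-level statistics, and the weighted sum used to combine the two signals are all assumptions made for illustration. The idea shown is only that each reward stream is standardized on its own scale before the two are mixed, so a large-magnitude commercial reward cannot swamp a small-magnitude per-turn linguistic reward.

```python
from statistics import mean, pstdev


def zscore(values, eps=1e-8):
    """Standardize a list of rewards to zero mean and (roughly) unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / (sigma + eps) for v in values]


def dual_horizon_advantages(turn_rewards, outcome_rewards,
                            turn_weight=1.0, outcome_weight=1.0):
    """Combine per-turn and per-dialogue rewards after separate normalization.

    turn_rewards:    one list of per-turn (short-horizon) rewards per dialogue.
    outcome_rewards: one (long-horizon) reward per dialogue, e.g. a conversion signal.
    The weights are hypothetical knobs, not values taken from the paper.
    """
    # Long-horizon stream: normalize across dialogues in the batch.
    outcome_adv = zscore(outcome_rewards)

    # Short-horizon stream: normalize across every turn in the batch.
    flat_adv = zscore([r for dialogue in turn_rewards for r in dialogue])

    advantages = []
    start = 0
    for dialogue, outcome in zip(turn_rewards, outcome_adv):
        end = start + len(dialogue)
        # Every turn receives its own normalized turn-level signal plus the
        # normalized outcome of the dialogue it belongs to.
        advantages.append([
            turn_weight * turn + outcome_weight * outcome
            for turn in flat_adv[start:end]
        ])
        start = end
    return advantages


if __name__ == "__main__":
    turns = [[0.2, 0.4], [0.6, 0.8]]   # small-scale linguistic rewards
    outcomes = [0.0, 100.0]            # large-scale commercial rewards
    for row in dual_horizon_advantages(turns, outcomes):
        print([round(a, 3) for a in row])
```

Because each stream is standardized independently, rescaling the outcome rewards (say, from a 0 to 100 range to a 0 to 100,000 range) leaves the combined advantages essentially unchanged in this sketch.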